
Wednesday, January 11, 2012

119th AAS Meeting, Austin Texas

Sorry for the lack of posting lately. Between job applications, the holidays, and conference travel I've been pretty busy, and not as productive research-wise.

I'm currently at the AAS meeting in Austin. It's been a lot of fun. I've especially enjoyed some of the career development talks they've had here. There was a great discussion about alternative paths for astronomers / physicists -- things like journalism, science policy, education, outreach, and science communication. I had some productive/helpful conversations with people who work at primarily teaching organizations, as well as people who do science outreach.

I attended a wonderful talk by Jean-luc Doumont on effective scientific communication. He apparently is speaking at Berkeley in April. I will encourage all my friends to go to this.

I'm currently on a "talking tour." Basically visiting a bunch of universities talking about my research and trying to convince them to hire me. It's actually a lot of fun (much more fun than I thought it would be).

Here is my AAS poster on my recent research on Luminosity Dependent Quasar Clustering:

Thursday, November 3, 2011

Talk at Brookhaven

Yesterday I gave the first research seminar of my life. It was a 50 minute talk at Brookhaven National Labs. Overall, I think it went well. The talk was to a small group of research scientists. About half were cosmologists, the other half were physicists of another flavor.

My hosts Erin Sheldon and Anže Slosar gave me some really helpful feedback:
  1. I wasn't expecting such a mixed audience, and I didn't give enough of an introduction to BOSS and the data. For instance, I didn't directly discuss why quasars are hard to target. I also didn't emphasize that the inputs to the likelihood were the photometric fluxes. I should have talked more about the difference between photometric and spectroscopic data. In general, I need to be more aware of the audience and beef up the introduction.
  2. I went through the full derivation of the likelihood, and this wasn't necessary.
  3. I was nervous and relied on notecards. Anže said this made it look like I didn't understand what I was talking about. Basically, I need to practice the talk enough that I don't need cards.
  4. I misspoke when talking about the first BAO detection: I said it was from SDSS-II, when it was actually from SDSS-I.
  5. In general, I didn't feel prepared to answer all the questions. I need to go through the talk and make sure I fully understand everything I am presenting -- for instance, how was the first BAO detection made?
The talk was more fun than I expected, and has definitely eased my nerves in terms of my quals. I think I can use a similar version of this talk for my quals, which is good.

Wednesday, September 7, 2011

More streamlining

Martin White keeps telling me I should streamline my code/runs so that it is really easy to re-run with new data/subsets.

I've been working on doing that for running the correlation functions and the error analysis. I've written code:

../Jessica/qsobias/Correlate/runCrossCorrelationJackknifeMulti.py

That takes the run directories and QSO file names as inputs (along with other constants for the correlation) and runs the correlation on all of those files at once.
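Roughly, the driver has the following structure (a minimal sketch with made-up file and function names, not the actual interface of the code above):

import os

def run_one_correlation(run_dir, qso_file, mincorr, maxcorr, nbins, oversample):
    # Stand-in: in the real pipeline this would call the correlation + jackknife
    # code in ../Jessica/qsobias/Correlate/ for a single subsample.
    print('running %s on %s' % (run_dir, qso_file))

# Run directories and QSO subsample files (illustrative names only)
run_dirs  = ['brightrun', 'dimrun', 'lowzrun', 'highzrun']
qso_files = ['qso_bright.dat', 'qso_dim.dat', 'qso_lowz.dat', 'qso_highz.dat']

# Constants shared by every run
mincorr, maxcorr, nbins, oversample = 0.1, 10.0, 15, 5

for run_dir, qso_file in zip(run_dirs, qso_files):
    if not os.path.exists(run_dir):
        os.makedirs(run_dir)
    run_one_correlation(run_dir, qso_file, mincorr, maxcorr, nbins, oversample)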

Then there is code that compares the correlation functions and makes a bunch of pretty plots. I'm running it for the first time now. Hope it all works!

Oh, and I need to talk to Martin White about what to do when certain realizations of the bootstrap/jackknife are negative. This causes problems with fitting the power-law, and also isn't physical. Currently I re-run the realization if it has a negative value for the correlation function, but I don't think this is the right thing to do, as it biases the bootstrap toward higher clustering values.


Black Hole Masses

I had a nice chat with Benny Trakhtenbrot this morning about calculating black hole masses and his research. He had some useful suggestions. First, he suggested that I look at the McLure and Dunlop 2004 paper instead of Vestergaard and Peterson 2006 for calculating the black hole masses. I should ask Martin if he agrees.

He also suggested that I talk to Yue Shen about his catalogs of black hole masses (or maybe it was line widths). I don't know how different this is from the catalog that Ian McGreer has given us.

He also thought that perhaps we might want to do a separation by Eddington ratio and see if there is a clustering dependence there.

He was very helpful, and overall a productive meeting.
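For reference, both papers use single-epoch virial estimators of the same general form. Here is a minimal sketch of a Vestergaard & Peterson style Hbeta estimator; the zero-point and exponents below are approximate values from memory, so they should be checked against the papers before being used for anything real:

import numpy as np

def virial_bh_mass(fwhm_kms, lum_5100, zero_point=6.91):
    # Single-epoch virial BH mass, VP06-style Hbeta calibration (approximate).
    # fwhm_kms   : Hbeta FWHM in km/s
    # lum_5100   : continuum luminosity lambda*L_lambda(5100 A) in erg/s
    # zero_point : calibration constant (check against VP06 / McLure & Dunlop 2004)
    # Returns log10(M_BH / M_sun).
    return (zero_point
            + 2.0 * np.log10(fwhm_kms / 1000.0)
            + 0.5 * np.log10(lum_5100 / 1e44))

# Example: FWHM = 4000 km/s, lambda*L_lambda(5100 A) = 1e45 erg/s -> log(M_BH) ~ 8.6
print(virial_bh_mass(4000.0, 1e45))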

I've been working on trying to get a stronger clustering signal separation. So far I've divided the QSO sample into even brighter/dimmer sets, as well as applied a uniform redshift selection to my two samples to see whether what I am measuring is actually a redshift separation rather than a luminosity one. Fun fun fun!

Saturday, August 13, 2011

SDSS-III Collaboration Meeting - Vanderbilt

Exciting results! Luminosity Dependent Clustering of Quasars. More details here.


Tuesday, June 14, 2011

Temperature Plots in Python

I figured out how to do temperature plots in python. The only problem with the plot below is that the numbers on the axes are wrong. If I set 'extent' to be the actual extent of the axes (in this case x = [0.5, 1], y = [-28, -18]) then it plots a very skinny, tall plot. So I had to correct the axes by hand afterwards.


import numpy as np
import matplotlib.pyplot as plt

# Bin edges in redshift and absolute magnitude, plus bin centers
redbins = np.linspace(0.5, 1.0, num=51)
rcenter = (redbins[0:-1] + redbins[1:]) / 2.0
mbins = np.linspace(-28, -18, num=51)
mcenter = (mbins[0:-1] + mbins[1:]) / 2.0

# 2D histogram of the DR7 quasars (qsoM, qsored, and dr7cut are defined earlier in the session)
H, xedges, yedges = np.histogram2d(qsoM[dr7cut], qsored[dr7cut], bins=(mbins, redbins))

# extent here is a placeholder; the axis numbers were corrected by hand afterwards (see above)
plt.imshow(H, extent=[-3, 3, -3, 3], interpolation='nearest', origin='lower')
plt.colorbar()

plt.xlabel('redshift (z)')
plt.ylabel('absolute magnitude (i-band)')
plt.title('SDSS DR7 Quasar Density')

The above code is also in the following log file: ../logs/110614log.py
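One thing worth trying (I haven't tested it on this plot, so treat it as a suggestion): imshow defaults to aspect='equal', which is what squashes the image when the x and y ranges are so different. Passing the true extent together with aspect='auto' should give the correct axis numbers without the skinny plot:

plt.imshow(H, extent=[0.5, 1.0, -28, -18], aspect='auto',
           interpolation='nearest', origin='lower')
plt.colorbar()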

Thursday, June 9, 2011

More QSO-Galaxy Cross Correlations

Here are the QSO-Galaxy Cross-Correlations comparing high/low redshift QSOs. I divide the QSO sample in half (0.5 < z < 0.77) & (0.77 < z <1.0):

(plots on wiki)

Here are the QSO-Galaxy Cross Correlations comparing bright/dim QSOs. I divide the QSO sample into the 1/3 brightest (16.17 < g-mag < 20.20) and the 1/3 dimmest (25.87

(plots on wiki)

It's pretty interesting that there seems to be a difference in the correlation functions for bright/dim objects. I am not sure if we should expect this. The redshift distribution of these objects is similar, so that points to these objects actually being brighter (not just further away). I'm hoping to do this in terms of high/low luminosity... but I need to figure out the absolute magnitudes of these objects first. I've pinged Ross about this. More to come...
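For when I get to the absolute magnitudes, the conversion I have in mind is just the distance modulus. A minimal self-contained sketch, assuming a flat LCDM cosmology with H0 = 70 and Omega_m = 0.3 and leaving the K-correction as an input (since that's the part I still need to sort out):

import numpy as np

def luminosity_distance(z, h=0.7, om=0.3, nstep=10000):
    # Luminosity distance in Mpc for a flat LCDM cosmology (radiation ignored).
    c = 299792.458  # km/s
    zz = np.linspace(0.0, z, nstep)
    ez = np.sqrt(om * (1.0 + zz) ** 3 + (1.0 - om))
    d_c = c / (100.0 * h) * np.trapz(1.0 / ez, zz)  # comoving distance, Mpc
    return (1.0 + z) * d_c

def absolute_magnitude(m_app, z, kcorr=0.0):
    # M = m - 5*log10(D_L / 10 pc) - K(z); the K-correction is left to the caller.
    dl_pc = luminosity_distance(z) * 1.0e6
    return m_app - 5.0 * np.log10(dl_pc / 10.0) - kcorr

# Example: g = 19.5 at z = 0.75 (ignoring the K-correction) -> M ~ -23.8
print(absolute_magnitude(19.5, 0.75))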

Monday, June 6, 2011

Testing QSO-Galaxy Cross CF

I spent the last couple days making sure that I trusted the correlation function code I had developed for this project. I compared it to the results from a (slow) sphere-match code on a small data set.

The following are the testing codes (also here ../logs/cross_corr_check.idl). In IDL using spherematch:

; Read in data/random files
readcol,'cfht_data_tiny.dat',rad1,decd1, format = ('D,D')
readcol, 'qso_data_tiny.dat',rad2,decd2,cd, format = ('D,D,D') ;cd is comoving distance of qsos
readcol,'cfht_randoms_tiny.dat',rar1,decr1, format = ('D,D')

; Find number of objects in files
nr1 = n_elements(rar1)
nd1 = n_elements(rad1)
nd2 = n_elements(rad2)

; Correlate out to a large theta to get all pairs
thetamax = 50.0

print,'Starting cross-correlation:'
print,'Estimating DD...'
spherematch,rad1,decd1,rad2,decd2,thetamax,ind1a,ind1b,dist_dd,maxmatch=0

; Convert from angular separation to comoving distance separation
this_dd = 2*sin(dist_dd*!pi/360)*cd[ind1b]

;Bins go from 0.1 to 10 with 15 bins.
corrmin = 0.1D
corrmax = 10.0D
nbins = 15.0D

; Find bins lower, upper and centers
bins_lower = (corrmax-corrmin)/(nbins)*findgen(nbins)+corrmin
bins_upper = (corrmax-corrmin)/(nbins)*(findgen(nbins)+1)+corrmin
rmean = fltarr(nbins)
for i = 0,(nbins-1) do rmean[i] = (bins_lower[i]+bins_upper[i])/2.

; Bin the DD separation distances
dd = fltarr(nbins)
for i = 0,(nbins-1) do dd[i] = n_elements(where(this_dd gt bins_lower[i] AND this_dd le bins_upper[i]))

print,'Estimating DR...'
spherematch,rar1,decr1,rad2,decd2,thetamax,ind1,ind2,dist_dr1,maxmatch=0

this_dr = 2*sin(dist_dr1*!pi/360)*cd[ind2]
dr = fltarr(nbins)
for i = 0,(nbins-1) do dr[i] = n_elements(where(this_dr ge bins_lower[i] AND this_dr le bins_upper[i]))

; QG/QR estimator, normalized by (number of randoms / number of galaxies)
corr1 = 1L*dd/dr*1L*(nd2*nr1)/(1L*nd1*nd2)-1L

for i = 0,(nbins-1) do print, rmean[i], corr1[i]

Separation omega
0.430000 -0.115686
1.09000 -0.104478
1.75000 -0.120804
2.41000 -0.0914845
3.07000 -0.0393971
3.73000 -0.0268416
4.39000 0.0134841
5.05000 0.0596094
5.71000 0.0227162
6.37000 0.102554
7.03000 0.0929233
7.69000 0.0900670
8.35000 0.0591398
9.01000 0.0284724
9.67000 0.0598689

(Note that these "tiny" files only have 6 qsos and 9000 galaxies, so the correlation function values are very noisy; this was just a test, and I used small files so the slow spherematch code would finish quickly.)

By comparison I also have the python/C code which runs much faster (../Jessica/qsobias/Correlate/runCorrelation.py):

import numpy as N
from pylab import *
from correlationFunctions import *

#------------------------------------------------------------------------
# Create file names (tiny catalogs)
#------------------------------------------------------------------------

workingDir = 'tinyrun'
makeworkingdir(workingDir)
galaxyDataFile, qsoDataFile, randomDataFile, corr2dCodefile, argumentFile, runConstantsFile = makeFileNamesTiny(workingDir)

oversample = 5.    # Amount that randoms should be oversampled
corrBins = 25.0    # Number of correlation bins (+1)
mincorr = 0.1      # (Mpc/h comoving distance separation) Must be greater than zero if log-binning
maxcorr = 10.0     # (Mpc/h comoving distance separation)
convo = 180./pi    # conversion factor between radians and degrees
tlogbin = 1        # = 0 for uniform spacing, = 1 for log spacing in theta

#------------------------------------------------------------------------
# Write run constants to a file
#------------------------------------------------------------------------

writeRunConstantsToFile(runConstantsFile, galaxyDataFile, qsoDataFile, \
    randomDataFile, corr2dCodefile, argumentFile, oversample, corrBins, \
    mincorr, maxcorr, tlogbin)

#------------------------------------------------------------------------
# Compute the Angular Correlation Function
#------------------------------------------------------------------------

runcrossCorrelation(workingDir, argumentFile, corr2dCodefile, galaxyDataFile, \
    qsoDataFile, randomDataFile, mincorr, maxcorr, corrBins, tlogbin)


# separation (Mpc/h)   crossw
0.4300000000   -0.1156862745
1.0900000000   -0.1044776119
1.7500000000   -0.1208039566
2.4100000000   -0.0914845135
3.0700000000   -0.0393970538
3.7300000000   -0.0268417043
4.3900000000    0.0134841235
5.0500000000    0.0596093513
5.7100000000    0.0227161938
6.3700000000    0.1025539385
7.0300000000    0.0929232804
7.6900000000    0.0900670231
8.3500000000    0.0591397849
9.0100000000    0.0284723490
9.6700000000    0.0598689436


As you can see the correlation functions match!

Wednesday, June 1, 2011

Data in Order

I've finished creating the randoms for the CFHT data. They are here:

../data/jessica/alexieData/Catalogs/cfht_random_catalog.dat

So now I have the masked galaxy data:
../data/jessica/alexieData/Catalogs/cfht_data.dat

The qso data:
../data/jessica/alexieData/Catalogs/qso_data.dat

The code to make these catalogs is here:
randoms: ../Jessica/qsobias/Analysis/make_random_catalog.pro
galaxy data: ../Jessica/qsobias/Analysis/doall_cfht_se.pro
qso data: ../Jessica/qsobias/Analysis/make_qso_catalog.py

In my meeting last week, I realized that I had been calculating the correlation function incorrectly. I need to multiply the angle by the comoving distance to get a physical separation, not an angular separation. This requires tweaking the C code a bit.
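For the record, the conversion is the same one already used in the spherematch test above: take (the chord of) the angular separation and scale it by the comoving distance to the quasar. A minimal sketch, assuming theta is in degrees and the comoving distance is in Mpc/h:

import numpy as np

def transverse_separation(theta_deg, comoving_dist):
    # Comoving transverse separation (Mpc/h) from an angular separation.
    # Uses the chord 2*sin(theta/2); for small angles this is ~ theta[rad] * D_C.
    theta_rad = np.radians(theta_deg)
    return 2.0 * np.sin(theta_rad / 2.0) * comoving_dist

# Example: 0.1 degrees at D_C = 2000 Mpc/h -> ~3.5 Mpc/h
print(transverse_separation(0.1, 2000.0))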

Tuesday, May 24, 2011

Data Handling

For the last few days I've been doing a bunch of data handling for this new project. Below is what I've done:

Alexie wrote a pipeline to collate data and apply masks. The code is here: .../qsobias/Analysis/doall_cfht_se.pro

I added some code to remove the repeats: ../qsobias/Analysis/remove_repeats.pro

To just start us off I wrote some code to trim the data and make randoms in the entire footprint (because we don't have masks yet). This code is in the following log file: .../logs/110522log.pro

;I start with the galaxy catalog from Alexie: ../Catalogs/cs82_all_cat6.fits
;I remove repeats:

cat6 = '/clusterfs/riemann/data/jessica/alexieData/Catalogs/cs82_all_cat6.fits'
cat7 = '/clusterfs/riemann/data/jessica/alexieData/Catalogs/cs82_all_cat7.fits'

infile = cat6
outfile = cat7
remove_repeats,infile,outfile

;Read in file with repeats removed

infile = '/clusterfs/riemann/data/jessica/alexieData/Catalogs/cs82_all_cat7.fits'
str=mrdfits(infile,1)

;Do some simple geometry cuts on ra, dec, and z
deltacut = where(str.DELTA_J2000 gt -1.0 and str.DELTA_J2000 lt 0.94)
str = str[deltacut]

alphacut = where(str.ALPHA_J2000 gt -42.5 and str.ALPHA_J2000 lt 45.0)
str = str[alphacut]

zcut = where(str.ZPHOT le 1.0)
str = str[zcut]

;Do some cuts that Alexie recommends to make sure we are getting galaxies, inside the mask, and only down to mag = 23.7
gooddatacut = where(str.mask eq 0 and str.mag_auto lt 23.7 and str.class_star lt 0.9)
str = str[gooddatacut]

;Output data to files
outfile = '/clusterfs/riemann/data/jessica/alexieData/Catalogs/cs82_all_cat8.fits'
mwrfits, str, outfile, /create

;This is the galaxy catalog I am using for the preliminary correlation function test
thisfile = '/clusterfs/riemann/data/jessica/alexieData/Catalogs/cfht_data.dat'
writecol,thisfile,str.alpha_j2000,str.delta_j2000,str.zphot

;Now to do the same to the BOSS Data
;This code can be found in 110519log.py

;Read in BOSS QSO Data
qsofile = '/home/jessica/qsobias/Stripe_82_QSOs_unique.dat'
readcol,qsofile,ra,dec,z,gmag,format='(D,D,D,D,X)'

;Remove duplicates
spherematch, ra, dec, ra, dec, 2./3600, m1, m2, maxmatch=0
dups = m1[where(m1 gt m2)]
good = [indgen(n_elements(ra))*0]+1
good[dups]=0
ra = ra[where(good)]
dec = dec[where(good)]
z = z[where(good)]
gmag = gmag[where(good)]

thisfile = '/home/jessica/qsobias/QSOs_unique.dat'
writecol,thisfile,ra,dec,z,gmag

#Now in python

import numpy as N
from pylab import *
from dataplay import *

galfile = '/clusterfs/riemann/data/jessica/alexieData/Catalogs/cfht_data.dat'
gal=N.loadtxt(galfile,comments='#')
galra = gal[:,0]
galdec = gal[:,1]
galz = gal[:,2]

#Make Cuts on QSOs
qsofile = '/home/jessica/qsobias/QSOs_unique.dat'
qso=N.loadtxt(qsofile,comments='#')
qsora = qso[:,0]
qsodec = qso[:,1]
qsoz = qso[:,2]
qsog = qso[:,3]


#Dec between -1 and 1
deccut = N.where((qsodec < 0.94) & (qsodec > -1.0)) #make sure they are inside the cfht mask

qsora = qsora[deccut]
qsodec = qsodec[deccut]
qsoz = qsoz[deccut]
qsog = qsog[deccut]

#Redshift between 0 and 1
zcut = N.where((qsoz >= 0.0) & (qsoz < 1.0))
qsora = qsora[zcut]
qsodec = qsodec[zcut]
qsoz = qsoz[zcut]
qsog = qsog[zcut]

#RA range (-42.5, 45)
highra = N.where(qsora > 300)
qsora[highra] = qsora[highra] - 360.
racut = N.where((qsora < 45.0) & (qsora > -42.5))

qsora = qsora[racut]
qsodec = qsodec[racut]
qsoz = qsoz[racut]
qsog = qsog[racut]

#Write masked data to files
thisfile = '/home/jessica/qsobias/QSO_data_masked.dat'
writeQSOsToFile(qsora,qsodec,qsoz,qsog,thisfile)

#This is the QSO data file
thisfile = '/home/jessica/qsobias/QSO_data.dat'
writeQSOsToFile2(qsora,qsodec,qsoz,thisfile)

#Plot the galaxies and qsos
plot(galra, galdec, 'b,', label = 'galaxy data')
plot(qsora, qsodec, 'r,', label = 'qso data')

xlabel('ra')
ylabel('dec')
title('RA & Dec of Data')
from pylab import *
legend(loc=1)

# Translate into comoving distance (from redshift, ra, dec)
qsox, qsoy, qsozcoord, qsocd = raDecZtoXYZ(qsora, qsodec, qsoz)  # keep qsoz as the redshift; the third output is the Cartesian z-coordinate

min(qsocd)
0 #Mpc/h
max(qsocd)
2312.67 #Mpc/h

# Plot redshift distribution of galaxies and qsos

bins = linspace(start = 0.01, stop = 1.0, num = 20)
histresult = histogram(qsoz, bins=bins)
qson = histresult[0]
plot(bins, qson, 'b.-', label = 'qso data')

histresult = histogram(galz, bins=bins)
galn = histresult[0]
plot(bins, galn/1000, 'r.-', label = 'galaxy data (k)')

xlabel('redshift bin (z)')
ylabel('number of QSO/gal in bin')
title('Redshift Distribution of Galaxies and QSOs')
from pylab import *
legend(loc=2)

;Back to IDL
;Make random catalog
; code in ../log/110519log.py

;Read in galaxy data
thisfile = '/clusterfs/riemann/data/jessica/alexieData/Catalogs/cfht_data.dat'
readcol, thisfile, ra, dec, z, format='(F,F,F)'

;Make 10X as many randoms as we have galaxies
nrandoms = n_elements(ra)*10

;Make random ra / dec in the Stripe 82 range
randomdec = 90.0-acos(RANDOMU(S, 2*nrandoms)*0.03385776-0.01745246)*180./!pi
deccut = where(randomdec GT -1.0 AND randomdec LT 0.94)
randomdec = randomdec[deccut]
randomdec = randomdec[0:nrandoms-1]
randomra = RANDOMU(S, nrandoms)*87.5 - 42.50

; Make file of ra, dec of spectroscopic randoms inside mask
thisfile = '/clusterfs/riemann/data/jessica/alexieData/Catalogs/cfht_randoms.dat'
writecol,thisfile, randomra, randomdec

;Calculate correlation function using spherematch (should actually use the c-code because this takes a long time)

;Read in galaxy data
thisfile = '/clusterfs/riemann/data/jessica/alexieData/Catalogs/cfht_data.dat'
readcol, thisfile, gra, gdec, gz, format='(F,F,F)'

;Read in random data
thisfile = '/clusterfs/riemann/data/jessica/alexieData/Catalogs/cfht_randoms.dat'
readcol, thisfile, rra, rdec, format='(F,F)'

;Read in QSO data
thisfile = '/home/jessica/qsobias/QSO_data.dat'
readcol, thisfile, qra, qdec, qz, format='(F,F,F)'

;Make theta bins
theta = findgen(11)/2.-5.
logtheta = 10D^(theta)
maxtheta = max(logtheta)
bins = logtheta
nbins = n_elements(bins) - 1

nr = n_elements(rra)
nq = n_elements(qra)
ng = n_elements(gra)

;Find distance separations using sphere match
print,'Starting cross-correlation:'
print,'Estimating DD...'
spherematch,qra,qdec,gra,gdec,maxtheta,ind1a,ind1b,dist_dd,max=0

print,'Estimating DR...'
spherematch,qra,qdec,rra,rdec,maxtheta,ind1,ind2,dist_dr,max=0
print,'Done cross-correlating.'

;Find number of pairs per bin
bins_lower = bins[0:nbins-1]
bins_upper = bins[1:nbins]

dd = fltarr(nbins)
dr = fltarr(nbins)
rmean = fltarr(nbins)

for i = 0,(nbins-1) do dd[i] = n_elements(where(dist_dd gt bins_lower[i] AND dist_dd le bins_upper[i]))

for i = 0,(nbins-1) do dr[i] = n_elements(where(dist_dr gt bins_lower[i] AND dist_dr le bins_upper[i]))

;Calculate the correlation function: (QG/QR)*(Nrandoms/Ngalaxies) - 1
corr1 = dd/dr*(nq*nr/(nq*ng)) - 1.0

Friday, May 20, 2011

New Project

Because things aren't yet working with my reconstruction project, David has suggested I work on something else in parallel so that if I end up not being able to get a result with the reconstruction, I still have another option for my thesis.

Here is the new project description:
~~~~~~~~~~~~

SDSS-III Project 127: Cross-correlation of BOSS spectroscopic quasars on Stripe 82 with photometric CFH galaxies.

Participants:
Jessica Kirkpatrick
Martin White, David Schlegel, Nic Ross, Alexie Leauthaud, Jean-Paul Kneib

Categories: BOSS

Project Description:
We plan to measure the intermediate scale clustering of low redshift BOSS quasars along Stripe 82 by cross-correlation against the photometric galaxy catalog from the CFH i-band imaging on Stripe 82. There are approximately 1,000 BOSS quasars with 0.5 < z < 1 and just under 6 million galaxies in the -43 < RA < 43 and -1 < DEC < 1 region brighter than i = 23.5, which should lead to a strong detection of clustering over approximately 2 orders of magnitude in length scale. The geometry of the stripe suggests errors on the cross-correlation can be efficiently obtained by jackknife or bootstrap sampling the ~50 2x2 degree blocks.

We intend to split the QSO sample in luminosity and black hole mass. We plan to estimate the BH mass using the fits from Vestergaard and Peterson, knowing the Hbeta line width and the continuum luminosity at 5100A. The pipeline measures the former; we plan to measure the latter from the photometry calibrated with Ian McGreer's mocks.

If we use the QG/QR-1 estimator we do not need the quasar mask, only that of the galaxies which will be provided by the CS82 team in the form of a pixelized mask from visual inspection. The dN/dz of the galaxies is known from photometric redshifts plus spectroscopic training sets. While the galaxies could be split in photometric redshift bins, the gains from doing so are not expected to be large, so our initial investigations will simply cross-correlate the quasars with the magnitude limited galaxy catalog.

Along with this project we will submit a request for EC status for:

Ludo van Waerbeke
Hendrik Hildebrandt
David Woods
Thomas Erben

who were instrumental in obtaining and reducing the CS82 data and producing the required galaxy catalog and mask but are not members of the BOSS collaboration.

Details are available at https://www.sdss3.org/internal/publications/cgi-bin/projects.pl/display_project/127

~~~~~~~~~~~~~
The first thing Martin wanted me to do was to approximate the error bars for the correlation function based on the density of galaxies and quasars in my sample.

Starting with the luminosity function in this paper, I am doing the following to estimate the density.

According to Table 1 in Ilbert et al., the following are the Schechter parameters for the galaxy luminosity function at redshift 0.6-0.8:

ϕ* = 5.01e-3 h^3 Mpc^(-3) mag^(-1)
M* (i-band) = -22.17
α = -1.41

Inputting these into IDL's lf_schechter function we get the following:

mags = findgen(5*20+1)/20. - 23
vals= lf_schechter(mags, 5.01, -22.17, -1.41)
plot, mags, vals*10D^(-3), /ylog, XTITLE = 'Magnitude (iband)', YTITLE = 'log phi', TITLE = 'Luminosity Function', charsize = 2, charthick = 1


To get the density we integrate:
density = 0.05*vals*10D^(-3) ;bin size is 0.05 mags
print, total(density)
0.041830996
=4.18 * 10^-2 h^3 Mpc^-3

This is similar to what they get in this paper by Faber et al. According to the Faber paper, the luminosity density at redshift 0.7 is log10(j_B) = 8.5 (solar luminosities per Mpc^3). Assuming a typical galaxy luminosity of 10^10 L_sun:

j_B = 10^8.5 L_sun / Mpc^3 = 10^(-1.5) galaxies / Mpc^3 (h = 0.7)
    = 3.16*10^-2 galaxies / Mpc^3 (h = 0.7)
    = 9.21*10^-2 galaxies h^3 / Mpc^3

So we are looking at a density of something in the ballpark of 0.05 galaxies per (Mpc/h)^3.

To get it directly from the data:

5.5254 million galaxies (once mask/cuts applied)
sky area = 166 deg^2
redshift range = 0 - 1
redshift 1 = 2312.67 Mpc / h (in comoving distance)
volume = 166 sq-deg / (3282.90 sq-deg) * 4/3 pi r^3 = 2,619,871,820 (Mpc / h)^3

density = 2.00 *10^-3 galaxies * h^3 * Mpc^(-3)

This is an order of magnitude smaller. Not sure why.... am I perhaps doing the volume calculation incorrectly?

4257 quasars (out to redshift of 1, in the cfht footprint)
density = 1.66e-06 qsos * h^3 * Mpc^(-3)
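A possible culprit (a guess on my part, worth double-checking): 3282.9 deg^2 is the number of square degrees in one steradian, whereas the full sky is 4*pi sr ~ 41,253 deg^2. Using the full-sky value shrinks the volume by a factor of ~12.6 and brings the catalog-based density up to the same order as the luminosity-function estimate:

import numpy as np

ngal = 5.5254e6      # galaxies after mask/cuts
area_deg2 = 166.0    # survey area in square degrees
r_max = 2312.67      # comoving distance to z = 1, Mpc/h

full_sky_deg2 = 4.0 * np.pi * (180.0 / np.pi) ** 2   # ~41,253 deg^2
volume = (area_deg2 / full_sky_deg2) * (4.0 / 3.0) * np.pi * r_max ** 3

print(volume)         # ~2.1e8 (Mpc/h)^3
print(ngal / volume)  # ~2.7e-2 galaxies h^3 Mpc^-3, same order as the LF estimate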

Exchange with Martin White, RE: Estimating Errors

Martin,
I've figured out the density of the galaxies/qsos from both the LFs and the catalogs, and would like to estimate the errors in the correlation function. I understand that the errors go as 1 / sqrt(pair counts) in each bin. But going from galaxy/qso density to pair counts in a bin is where I am a bit lost. I asked Alexie, and she thought that I just take the density and multiply it by the volume of each bin in the correlation function, and then use that number to compute the pair counts. I don't see how that is the same as the pair counts for the correlation function. The correlation function is measuring separation, so how is that the same as the number of pairs in a volume with a side of the separation?


I went ahead and calculated the correlation function with the following bins (degrees):
theta = (1.0000000e-05, 3.1622777e-05, 0.00010000000, 0.00031622777, 0.0010000000, 0.0031622777, 0.010000000, 0.031622777, 0.10000000, 0.31622777, 1.0000000)

I get the following number of dd pairs in each bin:
dd = (589.000, 561.000, 49.0000, 386.000, 4675.00, 41535.0, 394556, 3.81242e+06, 3.60765e+07, 2.94166e+08)

1/sqrt(dd) = (0.0412043, 0.0422200, 0.142857, 0.0508987, 0.0146254, 0.00490674, 0.00159201, 0.000512153, 0.000166490, 5.83048e-05)

The catalogs are not properly masked, so we might get a reduction in dd values by perhaps 30% once we mask the data.

This would result in the following:

dd = (412.300, 392.700, 34.3000, 270.200, 3272.50, 29074.5, 276189., 2.66869e+06, 2.52536e+07, 2.05916e+08)

1/sqrt(dd) = (0.0492485, 0.0504626, 0.170747, 0.0608355, 0.0174808, 0.00586467, 0.00190282, 0.000612140, 0.000198993, 6.96875e-05)

I'm currently running the rr, and will have a "correlation function" this afternoon. Although this will of course be wrong, because I'm not masking properly yet. I should get the masks from Alexie today or tomorrow.

Jessica

~~~~~~

Jessica,

I understand that the errors go as 1 / sqrt(pair counts) in each bin. But going from galaxy/qso density to pair counts in a bin is where I am a bit lost.

If you think of the very simplest correlation function estimator that you can write down, xi=DD/RR-1, and imagine that you have so many randoms the fluctuations in RR are negligible then you see that the errors in 1+xi are given by the fluctuations in the counts of DD in a bin. Assuming Poisson statistics, the fractional error in 1+xi goes as 1/sqrt{Npair} where Npair is the number of quasar-galaxy pairs.

For a 3D correlation function the number of data pairs goes as Nqso times Nbar-galaxy times 1+xi times the volume of the bin (in 3D, e.g. 4\pi s^2 ds for a spherical shell). Just think of what the code does: sit on each quasar and count all the galaxies in the bin. To go from a 3D correlation function to a 2D correlation function you need to integrate in the Z direction. But remember that the sum of independent Poisson distributions is also a Poisson with a mean equal to the sum of the means of the contributing parts. So this allows you to figure out what the error on wp is. You should see that as you integrate to very large line-of-sight distance things become noisier. So choose something like +/-50Mpc/h for the width in line-of-sight distance to integrate over in defining wp.

It's a little easier to understand if you write the defining equations out for yourself on a piece of paper.

Martin

~~~~~~~
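Following Martin's recipe, here is a minimal sketch of the expected fractional error per bin for the 3D correlation function (illustrative only: the quasar count and galaxy density are the rough numbers from above, xi is set to zero for a conservative pair-count estimate, and the projection to wp would sum the shell counts along the line of sight):

import numpy as np

n_qso = 4257       # quasars in the CFHT footprint out to z = 1
nbar_gal = 2.0e-3  # galaxy density, h^3 Mpc^-3 (catalog-based estimate above)
xi = 0.0           # assume xi << 1 for a conservative estimate

# Logarithmic bins in comoving separation s (Mpc/h)
edges = np.logspace(np.log10(0.1), np.log10(10.0), 16)
s_lo, s_hi = edges[:-1], edges[1:]

# Spherical-shell volume per bin, then expected QSO-galaxy pair counts
v_shell = 4.0 / 3.0 * np.pi * (s_hi ** 3 - s_lo ** 3)
n_pair = n_qso * nbar_gal * (1.0 + xi) * v_shell

# Fractional (Poisson) error on 1 + xi in each bin
frac_err = 1.0 / np.sqrt(n_pair)
for lo, hi, err in zip(s_lo, s_hi, frac_err):
    print('%6.3f - %6.3f Mpc/h : %.3g' % (lo, hi, err))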
Project Reading list
http://arxiv.org/abs/0802.2105
https://trac.sdss3.org/wiki/BOSS/quasars/black_hole_masses
http://iopscience.iop.org/0004-637X/665/1/265/pdf/62903.web.pdf
http://articles.adsabs.harvard.edu//full/1989ApJ...343....1D/0000011.000.html
http://adsabs.harvard.edu/abs/2005A%26A...439..863I

Sunday, March 13, 2011

Unique Quasars

One of the comments from Hennawi about my paper was that he wanted to know how many unique quasars were simulated in the QSO catalog.

I wasn't exactly sure how to define "unique" quasar.

The simulated i-band fluxes range from 15 to 22.5. They are simulated to 4 decimal places.
The simulated redshifts (z) range from 0.5 to 5.0. They are simulated out to 6 decimal places.

If I define unique to be the same out to the maximum number of decimal places provided, then only 2 simulated quasars are repeats of each other (0.00002%).

If I define unique to be within 0.1% in magnitude/redshift of each other, then 6,285 (out of 10 million) are repeats (0.06%).

If I define unique to be within 1% in magnitude/redshift of each other, then there are 62,824 repeat objects (0.6%).

Anyway, so I guess I should ask Joe how he wants me to define repeats.

Here is the code to do this:
file = '/clusterfs/riemann/raid001/jessica/boss/Likeli_QSO.fits.gz'
qsos = mrdfits(file, 1)

imean = mean(qsos.i_sim)
zmean = mean(qsos.z_sim)

percent = 1.0/100. ;simulated quasars within this percentage of each other
ierr = imean*percent
zerr = zmean*percent

imult = round(1./ierr)*1.
zmult = round(1./zerr)*1.

roundqsosz = round(qsos.z_sim*zmult)/zmult
roundqsosi = round(qsos.i_sim*imult)/imult

uniqz = uniq(roundqsosz)
uniqi = uniq(roundqsosi)

all = where(qsos.z_sim GT -99999)

repeatZ = setdifference(all, uniqz)
repeatI = setdifference(all, uniqI)

help, setintersection(repeatZ, repeatI)

Also can be found in the following log file:
../logs/110313log.pro
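For comparison, here is a minimal Python sketch of the same rounding-based repeat count (illustrative only: in practice i_sim and z_sim would be read from Likeli_QSO.fits, and the exact definition of a repeat only approximately mirrors the IDL above):

import numpy as np

def count_repeats(i_sim, z_sim, percent=1.0):
    # Count objects whose (i, z) match another object to within `percent`
    # percent of the mean i-band value and the mean redshift.
    ierr = np.mean(i_sim) * percent / 100.0
    zerr = np.mean(z_sim) * percent / 100.0
    rounded = np.column_stack((np.round(i_sim / ierr), np.round(z_sim / zerr)))
    pairs, counts = np.unique(rounded, axis=0, return_counts=True)
    return int(np.sum(counts[counts > 1]))  # objects sharing a rounded (i, z)

# Example with fake data: 4 of the 5 objects collide after rounding
i_sim = np.array([18.0, 18.0001, 20.5, 20.5002, 21.9])
z_sim = np.array([1.50, 1.5001, 2.30, 2.3001, 3.10])
print(count_repeats(i_sim, z_sim, percent=1.0))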

Thursday, May 20, 2010

New Threshold Results

I've calculated the (new) thresholds using the method outlined in this post.

Below are the new lratio thresholds for the old version of the likelihood (v1) and for the likelihood with ALL the Quasars, as a function of targets per square degree (TPSD):

TPSD ----- v1 threshold - all quasar threshold
10.0000 -----0.749085 --------- 0.770306
20.0000 -----0.490932 --------- 0.571854
30.0000 -----0.347334 --------- 0.425242
40.0000 -----0.253020 --------- 0.329808
50.0000 -----0.198630 --------- 0.264506
60.0000 -----0.160074 --------- 0.214092
70.0000 -----0.128912 --------- 0.182231
80.0000 -----0.107913 --------- 0.152691

Here's a plot: cyan is ALL the Quasars, green is likelihood v1.


Using these thresholds here are the number of quasars targeted:
TPSD ---- #QSOs (v1) --- #QSOs (ALL Quasars)
10.0000 --- 443.000 --- 406.000
20.0000 --- 656.000 --- 582.000
30.0000 --- 781.000 --- 722.000
40.0000 --- 897.000 --- 830.000
50.0000 --- 965.000 --- 892.000
60.0000 --- 1035.00 --- 955.000
70.0000 --- 1089.00 --- 1000.00
80.0000 --- 1123.00 --- 1037.00

Here's a plot: cyan is ALL the Quasars, green is likelihood v1.

Looks like ALL the quasars doesn't do better after all with the non-Milky Way thresholds. Boo.
The log file to run this code is here: ../logs/100520log.pro

Monday, May 17, 2010

ALL the Quasars Results

I've calculated the thresholds using the method outlined in this post.

Below are the lratio thresholds for the old version of the likelihood (v1) and the likelihood with ALL the Quasars as a function of targets per square degree (TPSD):

TPSD ----- v1 threshold - all quasar threshold
10.0000 ----- 0.996883 ----- 0.984691
20.0000 ----- 0.868663 ----- 0.795943
30.0000 ----- 0.674674 ----- 0.629976
40.0000 ----- 0.496851 ----- 0.522806
50.0000 ----- 0.403819 ----- 0.441149
60.0000 ----- 0.318171 ----- 0.363089
70.0000 ----- 0.262071 ----- 0.310237
80.0000 ----- 0.215056 ----- 0.268664

Here's a plot: cyan is ALL the Quasars, green is likelihood v1.


Using these thresholds here are the number of quasars targeted:
TPSD ---- #QSOs (v1) --- #QSOs (ALL quasars)
10.0000 -- 132.000 ----- 155.000
20.0000 -- 343.000 ----- 374.000
30.0000 -- 484.000 ----- 527.000
40.0000 -- 651.000 ----- 638.000
50.0000 -- 731.000 ----- 713.000
60.0000 -- 815.000 ----- 793.000
70.0000 -- 888.000 ----- 853.000
80.0000 -- 944.000 ----- 889.000

Here's a plot: cyan is ALL the Quasars, green is likelihood v1.


Next, I'll look at the difference between these QSO populations, but first I want Adam/Erin to confirm that I am doing things correctly.

The code to do this is in the following log file: ../logs/100517log.pro

Wednesday, May 12, 2010

ALL the Quasars

Recently Dinosaur Comics has been emphasizing ALL of everything. ALL the secrets, ALL the balls, ALL the burgers. So today we are going to create the QSO catalog with ALL the Quasars.

Before we had a brightness limit and some other cuts, but Hennawi wanted me to see what it looked like with all of them.

I did this by modifying qso_fake_jess.pro with the following call to qso_photosamp:
qsos = qso_photosamp(Z_MIN = Z_MIN, Z_MAX = Z_MAX, NOCUTS = NOCUTS, ILIM = 23.0)

Here is what these QSOs look like in color-color space:


And re-running hiz_kde_numerator_jess.pro:
.run hiz_kde_numerator_jess.pro

To create a new QSO Catalog:
../likelihood/qsocatalog/QSOCatalog-Wed-May-12-14:03:41-2010.fits

Here is what the QSO Catalog looks like in color-color space:



Change likelihood_compute to read in this catalog:
; Read in the QSO file
file2 = '/home/jessica/repository/ccpzcalib/Jessica/likelihood/qsocatalog/QSOCatalog-Wed-May-12-14:03:41-2010.fits'

This run is in the directory:
../likelihood/likev1all

The log file is here:
../logs/100512log.pro

~~~~~~~~

On a side note, I looked at Google Analytics for this blog today. It turns out that over its lifetime 355 Absolute Unique Visitors have viewed my blog! These visitors are from 10 countries and 28 different states. Pretty cool! Yet very few people leave comments on my blog. Everyone reading this, leave a comment so I know you are out there!

~~~~~~~~

Is anyone else having problems editing their Blogspot blog using Google Chrome? It totally messes up the fonts, and when I look at the HTML it is all funky. Seems strange considering Google owns both Chrome and Blogspot! Maybe it is their way of punishing Mac users for the iPhone being better than the Google phone.

Tuesday, May 11, 2010

Likelihood Threshold Results

I've calculated the thresholds using the method outlined in yesterday's post. Calculating the likelihoods on the 43,134 targets in this 12 deg2 region took ~30 minutes (so that is quick).

Below are the thresholds for the old version of the likelihood (v1); in parentheses are the thresholds Adam Myers got:

Targets/deg2---------Threshold
10----------------0.995851
20----------------0.858535 (0.533)
30----------------0.664424
40----------------0.490932 (0.235)
50----------------0.400611
60----------------0.311577
70----------------0.256615
80----------------0.213374


So my thresholds still don't match his.
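For reference, here is a minimal sketch of the threshold calculation as I understand it (this is my own wording of the method, so treat the details as an assumption): for a given target density, sort the likelihood ratios in descending order and take the ratio of the last object that fits in the budget of (targets per deg^2) x (area of the region).

import numpy as np

def threshold_for_density(lratio, targets_per_deg2, area_deg2=12.0):
    # lratio           : likelihood ratios for all objects in the test region
    # targets_per_deg2 : desired target density
    # area_deg2        : area of the test region (12 deg^2 here)
    n_target = int(round(targets_per_deg2 * area_deg2))
    ranked = np.sort(lratio)[::-1]      # descending
    return ranked[n_target - 1]         # ratio of the last object kept

# Example with fake likelihood ratios for ~43,000 objects
rng = np.random.RandomState(0)
lratio = rng.uniform(0.0, 1.0, size=43134)
for tpsd in (10, 20, 40, 80):
    print(tpsd, threshold_for_density(lratio, tpsd))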

Here is a plot of deg2 vs threshold:



In terms of recovered QSOs with these thresholds:
Targets/deg2---Threshold-----Recovered QSOs (out of 1625)
10.0000 --- 0.995851 ------- 135 (8.3%)
20.0000 --- 0.858535 ------- 346 (21.3%)
30.0000 --- 0.664424 ------- 489 (30.1%)
40.0000 --- 0.490932 ------- 651 (40.1%)
50.0000 --- 0.400611 ------- 727 (44.7%)
60.0000 --- 0.311577 ------- 816 (50.2%)
70.0000 --- 0.256615 ------- 886 (54.5%)
80.0000 --- 0.213374 ------- 942 (58.0%)


I've saved these likelihoods in the directory: ../likelihood/likev1/
They are also here:
likelihoodv1thresholds.fits
qsolikelihoodv1thresholds.fits
The log file is here: ../logs/100511log.pro

Now to make some magic happen...

Tuesday, April 27, 2010

Likelihood Test Results (4)

Now I am going to see how the likelihood compares if we use the Richards luminosity function and also include the full redshift range (0.5 < z < 5.0). The code to run this is in the following log file: ../log/100427_3log.pro. This works even better than the last test:

eps = 1e-30
bossqsolike = total(likelihood.L_QSO_Z[16:32],1) ;quasars between redshift 2.1 and 3.7
qsolcut1 = where( alog10(total(likelihood.L_QSO_Z[16:32],1)) LT -9.0)
den = total(targets.L_EVERYTHING_ARRAY[0:4],1) + total(likelihood.L_QSO_Z[0:44],1) + eps
num = bossqsolike + eps
NEWRATIO = num/den
NEWRATIO[qsolcut1] = 0 ; eliminate objects with low L_QSO value


IDL> print, n_elements(ql) ; number quasars
461
IDL> print, n_elements(ql)*1.0/(n_elements(sl)+n_elements(ql)) ;percent accuracy
0.256111
IDL> print, n_elements(ql) + n_elements(sl) ; total targeted
1800
IDL>
IDL> print, n_elements(nql) ; number quasars
575
IDL> print, n_elements(nql)*1.0/(n_elements(nsl)+n_elements(nql)) ;percent accuracy
0.319444
IDL> print, n_elements(nql) + n_elements(nsl) ; total targeted

We now go from 25.6% accuracy to 31.9% accuracy! And we find 114 more quasars!

Plots
The white points were targeted/missed by both the new and old likelihoods
The magenta points were only targeted/missed by the new likelihood
The cyan points were only targeted/missed by the old likelihood



Monday, April 26, 2010

Likelihood Test Results (3)

As a sanity check I re-computed the likelihoods with the old qso catalog to make sure I got the same results as before, which I did. See ../logs/100426log.pro for code.

It seems that adding in the BOSS QSOs doesn't help.

Now I am going to run on the old qso catalog but with the larger redshift range (0.5 < z < 5.0) and see if this improves things. See ../logs/100426log.pro for code.

So this seems to actually make a difference. I have played around with the redshift range for the numerator to get the best selection numbers:

eps = 1e-30
bossqsolike = total(likelihood.L_QSO_Z[14:34],1) ;quasars between redshift 2.1 and 3.5
qsolcut1 = where( alog10(total(likelihood.L_QSO_Z[19:34],1)) LT -9.0)
den = total(targets.L_EVERYTHING_ARRAY[0:4],1) + total(likelihood.L_QSO_Z[0:44],1) + eps
num = bossqsolike + eps
NEWRATIO = num/den
NEWRATIO[qsolcut1] = 0 ; eliminate objects with low L_QSO value


qsolcut2 = where( alog10(total(targets.L_QSO_Z[0:18],1)) LT -9.0)
L_QSO = total(targets.L_QSO_Z[2:18],1)
den = total(targets.L_EVERYTHING_ARRAY[0:4],1) + total(targets.L_QSO_Z[0:18],1) + eps
num = L_QSO + eps
OLDRATIO = num/den
OLDRATIO[qsolcut2] = 0 ; eliminate objects with low L_QSO value


numsel = 1800-1
NRend = n_elements(newratio)-1
sortNR = reverse(sort(newratio))
targetNR = sortNR[0:numsel]
restNR = sortNR[numsel+1:NRend]

ORend = n_elements(oldratio)-1
sortOR = reverse(sort(oldratio))
targetOR = sortOR[0:numsel]
restOR = sortOR[numsel+1:ORend]

nql = setintersection(quasarindex,targetNR)
nqnl = setintersection(quasarindex, restNR)
nsl = setdifference(targetNR, quasarindex)
nsnl = setdifference(restNR, quasarindex)

ql = setintersection(quasarindex, targetOR)
qnl = setintersection(quasarindex, restOR)
sl = setdifference(targetOR,quasarindex)
snl = setdifference(restOR,quasarindex)

IDL> print, n_elements(ql) ; number quasars
461
IDL> print, n_elements(ql)*1.0/(n_elements(sl)+n_elements(ql)) ;percent accuracy
0.256111
IDL> print, n_elements(ql) + n_elements(sl) ; total targeted
1800
IDL>
IDL> print, n_elements(nql) ; number quasars
508
IDL> print, n_elements(nql)*1.0/(n_elements(nsl)+n_elements(nql)) ;percent accuracy
0.282222
IDL> print, n_elements(nql) + n_elements(nsl) ; total targeted
1800

So we go from 25.6% accuracy to 28.2% accuracy! And we find 47 more quasars!

Plots
The white points were targeted/missed by both the new and old likelihoods
The magenta points were only targeted/missed by the new likelihood
The cyan points were only targeted/missed by the old likelihood


The code for the above run is in the directory ../likelihood/run1/

Now to see how the different luminosity functions do. Below are the results from running with the Richards 2006 luminosity function:
QSO Catalog with Richards Luminosity Function


(See ../logs/100427_2log.pro for code)

IDL> print, n_elements(ql) ; number quasars
461
IDL> print, n_elements(ql)*1.0/(n_elements(sl)+n_elements(ql)) ;percent accuracy
0.256111
IDL> print, n_elements(ql) + n_elements(sl) ; total targeted
1800
IDL>
IDL> print, n_elements(nql) ; number quasars
543
IDL> print, n_elements(nql)*1.0/(n_elements(nsl)+n_elements(nql)) ;percent accuracy
0.301667
IDL> print, n_elements(nql) + n_elements(nsl) ; total targeted
1800

This does better than the old luminosity function!
So we go from 25.6% accuracy to 30.2% accuracy! And we find 82 more quasars!

Plots
The white points were targeted/missed by both the new and old likelihoods
The magenta points were only targeted/missed by the new likelihood
The cyan points were only targeted/missed by the old likelihood


Sunday, April 25, 2010

Likelihood Test Results (2)

I'm going to try adding in all QSOs (regardless of brightness). I've taken out the BOSS QSOs that are in Stripe-82 so that we don't have the same objects in the Monte Carlo as in the testing data.

Joe suggested doing this. We'll see if it improves things; if not, it looks like we should go back to the old catalog and see if modifying the luminosity function helps.

Here is what they look like:
BOSS QSOs


SDSS DR5 QSOs



The code to do this is in the following log file: ../logs/100425log.pro
It creates the following file to use as input to the Monte Carlo: ~/boss/allQSOMCInput.fits

Plot of allQSOMCInput.fits


I re-ran hiz_kde_numerator_jess.pro to make a new QSOCatalog, which is here:
../likelihood/qsocatalog/QSOCatalog-Mon-Apr-26-13:22:43-2010.fits

Plot of QSOCatalog-Mon-Apr-26-13:22:43-2010.fits


I changed likelihood_compute to look at the above catalog.

Run the likelihood calculation with this new input catalog:

smalltargetfile = "./smalltarget.fits"
smalltargets = mrdfits(smalltargetfile, 1)

.com splittargets2.pro

nsplit = 50L
splitsize = floor(n_elements(smalltargets)*1.0/nsplit)
junk = splitTargets2(smalltargets, nsplit, splitsize)

; run likelihood.script

.com mergeLikelihoods.pro
likelihood = mergelikelihoods(nsplit)


outfile = "./likelihoods2.0-4.0ALLBOSSAddedSmall.fits"
splog, 'Writing file ', outfile
mwrfits, likelihood, outfile, /create


IDL> print, n_elements(ql) ; number quasars
467
IDL> print, n_elements(ql)*1.0/(n_elements(sl)+n_elements(ql)) ;percent accuracy
0.259300
IDL> print, n_elements(ql) + n_elements(sl) ; total targeted
1801
IDL>
IDL> print, n_elements(nql) ; number quasars
418
IDL> print, n_elements(nql)*1.0/(n_elements(nsl)+n_elements(nql)) ;percent accuracy
0.232093
IDL> print, n_elements(nql) + n_elements(nsl) ; total targeted
1801


This doesn't do better than the old likelihood either (same color scheme as last post).



The logfile for this is ../logs/100425log.pro