
Thursday, September 15, 2011

I hate IDL

I've been working a little bit in IDL for this new project because all the data is in FITS files, and Alexie has a lot of IDL code that we are using to handle the CFHT data.

Anyway, I spent all morning struggling to do very simple input and output in this stupid stupid language.

I learned a few things, and since I hardly ever code in IDL, I am sure I'll forget unless I write them down right now. So here we go:

1) Tag names
You can get the tag names from a structure like this:

tname = tag_names(struct)

2) Merging structures
You can merge two structures as follows:
str3 = create_struct(str1,str2,NAME='str3')

3) writecol bug
For some reason when I try to use writecol to write out data, if there are more than 6 columns and I don't specify the format, it puts a carriage return in the middle of the data. This goes away if you specify the format:

thisfile = './qsosMergedInfo.dat'
writecol,thisfile, qsos.ra, qsos.dec, qsos.z, qsos.zwarning, $
qsos.flux_clip_mean[0],qsos.flux_clip_mean[1], qsos.flux_clip_mean[2], $
qsos.flux_clip_mean[3], qsos.flux_clip_mean[4], fmt='(f,f,f,f,f,f,f,f,f)'
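Since I keep flipping between IDL and Python, here is a numpy structured-array sketch of tips 1 and 2, in case the same FITS tables get read into Python instead (the fields a/b/c are made up for illustration):

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Two structured arrays standing in for IDL structures
str1 = np.array([(1, 2.5)], dtype=[('a', 'i4'), ('b', 'f8')])
str2 = np.array([(3.0,)], dtype=[('c', 'f8')])

# Analogue of tag_names(struct): the field names
print(str1.dtype.names)   # ('a', 'b')

# Analogue of create_struct(str1, str2): merge field-by-field
str3 = rfn.merge_arrays((str1, str2), flatten=True)
print(str3.dtype.names)   # ('a', 'b', 'c')
```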


Wednesday, September 7, 2011

More streamlining

Martin White keeps telling me I should streamline my code/runs so that it is really easy to re-run with new data/subsets.

I've been working on doing that for running the correlation functions, and running the error analysis. I've written code:

../Jessica/qsobias/Correlate/runCrossCorrelationJackknifeMulti.py

that takes the run directories and QSO file names as inputs (along with other constants for the correlation) and runs the correlation on all of those files at once.

Then there is code that compares the correlation functions and makes a bunch of pretty plots. I'm running it for the first time now. Hope it all works!

Oh, and I need to talk to Martin White about what to do when certain realizations of the bootstrap/jackknife are negative. This causes problems with fitting the power-law, and also isn't physical. Currently I re-run the realization if it has a negative value for the correlation function, but I don't think this is the right thing to do, as it biases the bootstrap toward higher clustering values.


Tuesday, June 14, 2011

Temperature Plots in Python

I figured out how to do temperature plots in Python. The only problem with the code below is that the numbers on the axes are wrong. If I set 'extent' to the actual extent of the axes (in this case x = [0.5,1], y = [-28,-18]), it plots a very skinny, tall plot. So I had to correct the axes by hand later.


import numpy as np
import matplotlib.pyplot as plt

redbins = np.linspace(start = 0.5, stop = 1.0, num = 51)
rcenter = (redbins[0:-1]+redbins[1:])/2.0
mbins = np.linspace(start = -28, stop = -18, num = 51)
mcenter = (mbins[0:-1]+mbins[1:])/2.0

H, xedges, yedges = np.histogram2d(qsoM[dr7cut], qsored[dr7cut], bins=(mbins,redbins))
X = rcenter
Y = mcenter
Z = H

#plt.imshow(H, extent=[-3,3,-3,3], interpolation='bilinear', origin='lower')
#plt.colorbar()

plt.imshow(H, extent=[-3,3,-3,3], interpolation='nearest', origin='lower')
plt.colorbar()

plt.xlabel('redshift (z)')
plt.ylabel('absolute magnitude (i-band)')
plt.title('SDSS DR7 Quasar Density')
The above code is also in the following log file: ../logs/110614log.py
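For the record, the skinny-plot problem is almost certainly the aspect ratio: imshow defaults to square data pixels (aspect='equal'), so an extent of 0.5 in x against 10 in y gets drawn literally. Passing aspect='auto' should let the axes carry the real coordinates. A sketch with fake uniform counts, since the quasar arrays aren't reproduced here:

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # render off-screen
import matplotlib.pyplot as plt

# Fake counts on the same 50x50 grid shape as the quasar histogram
H = np.random.rand(50, 50)

fig, ax = plt.subplots()
# extent = [xmin, xmax, ymin, ymax] in real data coordinates;
# aspect='auto' stretches the image to fill the axes instead of
# forcing square pixels (which made the 0.5-by-10 plot skinny)
im = ax.imshow(H, extent=[0.5, 1.0, -28, -18], origin='lower',
               interpolation='nearest', aspect='auto')
fig.colorbar(im)
ax.set_xlabel('redshift (z)')
ax.set_ylabel('absolute magnitude (i-band)')
fig.savefig('density.png')
```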

Monday, June 6, 2011

Testing QSO-Galaxy Cross CF

I spent the last couple days making sure that I trusted the correlation function code I had developed for this project. I compared it to the results from a (slow) sphere-match code on a small data set.

The following are the testing codes (also here ../logs/cross_corr_check.idl). In IDL using spherematch:

; Read in data/random files
readcol,'cfht_data_tiny.dat',rad1,decd1, format = ('D,D')
readcol, 'qso_data_tiny.dat',rad2,decd2,cd, format = ('D,D,D') ;cd is comoving distance of qsos
readcol,'cfht_randoms_tiny.dat',rar1,decr1, format = ('D,D')

; Find number of objects in files
nr1 = n_elements(rar1)
nd1 = n_elements(rad1)
nd2 = n_elements(rad2)

; Correlate out to a large theta to get all pairs
thetamax = 50.0

print,'Starting cross-correlation:'
print,'Estimating DD...'
spherematch,rad1,decd1,rad2,decd2,thetamax,ind1a,ind1b,dist_dd,maxmatch=0

; Convert from angular separation to comoving distance separation
this_dd = 2*sin(dist_dd*!pi/360)*cd[ind1b]

;Bins go from 0.1 to 10 with 15 bins.
corrmin = 0.1D
corrmax = 10.0D
nbins = 15.0D

; Find bins lower, upper and centers
bins_lower = (corrmax-corrmin)/(nbins)*findgen(nbins)+corrmin
bins_upper = (corrmax-corrmin)/(nbins)*(findgen(nbins)+1)+corrmin
rmean = fltarr(nbins)
for i = 0,(nbins-1) do rmean[i] = (bins_lower[i]+bins_upper[i])/2.

; Bin the DD separation distances
dd = fltarr(nbins)
for i = 0,(nbins-1) do dd[i] = n_elements(where(this_dd gt bins_lower[i] AND this_dd le bins_upper[i]))

print,'Estimating DR...'
spherematch,rar1,decr1,rad2,decd2,thetamax,ind1,ind2,dist_dr1,maxmatch=0

this_dr = 2*sin(dist_dr1*!pi/360)*cd[ind2]
dr = fltarr(nbins)
for i = 0,(nbins-1) do dr[i] = n_elements(where(this_dr gt bins_lower[i] AND this_dr le bins_upper[i]))

corr1 = 1L*dd/dr*1L*(nd2*nr1)/(1L*nd1*nd2)-1L

for i = 0,(nbins-1) do print, rmean[i], corr1[i]

Separation omega
0.430000 -0.115686
1.09000 -0.104478
1.75000 -0.120804
2.41000 -0.0914845
3.07000 -0.0393971
3.73000 -0.0268416
4.39000 0.0134841
5.05000 0.0596094
5.71000 0.0227162
6.37000 0.102554
7.03000 0.0929233
7.69000 0.0900670
8.35000 0.0591398
9.01000 0.0284724
9.67000 0.0598689

(Note that these "tiny" files only have 6 QSOs and 9000 galaxies, so the correlation function values are very noisy; this was just a test, and I used small files so it would run quickly.)
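The logic of the IDL test above (bin the DD and DR pair separations, then form w = (DD/DR)(Nr/Nd) - 1, matching the corr1 line) can be sketched with plain numpy. This is a brute-force toy with made-up separations, not the fast C code:

```python
import numpy as np

def dd_dr_estimator(sep_dd, sep_dr, nd1, nr1, bins):
    """Bin data-data and data-random pair separations and form
    w = (DD/DR) * (Nr/Nd) - 1, as in the IDL test above."""
    dd, _ = np.histogram(sep_dd, bins=bins)
    dr, _ = np.histogram(sep_dr, bins=bins)
    return (dd / dr) * (nr1 / nd1) - 1.0

# Toy pair separations (Mpc/h): randoms oversampled 2x relative to data
rng = np.random.default_rng(0)
sep_dd = rng.uniform(0.1, 10.0, 1000)
sep_dr = rng.uniform(0.1, 10.0, 2000)
bins = np.linspace(0.1, 10.0, 16)          # 15 bins from 0.1 to 10
rmean = 0.5 * (bins[:-1] + bins[1:])

w = dd_dr_estimator(sep_dd, sep_dr, nd1=100, nr1=200, bins=bins)
# With both sets uniform, w should just scatter around zero
```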

By comparison I also have the Python/C code, which runs much faster (../Jessica/qsobias/Correlate/runCorrelation.py):

import numpy as N
from pylab import *
from correlationFunctions import *

#------------------------------------------------------------------------
# Create file names (tiny catalogs)
#------------------------------------------------------------------------

workingDir = 'tinyrun'
makeworkingdir(workingDir)
galaxyDataFile, qsoDataFile, randomDataFile, corr2dCodefile, argumentFile, runConstantsFile = makeFileNamesTiny(workingDir)

oversample = 5. # Amount that randoms should be oversampled
corrBins = 25.0 # Number of correlation bins (+1)
mincorr = 0.1 # (Mpc/h comoving distance separation) Must be greater than zero if log-binning
maxcorr = 10.0 # (Mpc/h comoving distance separation)
convo = 180./pi # conversion from radians to degrees
tlogbin = 1 # = 0 for uniform spacing, = 1 for log spacing in theta

#------------------------------------------------------------------------
# Write run constants to a file
#------------------------------------------------------------------------

writeRunConstantsToFile(runConstantsFile, galaxyDataFile, qsoDataFile, \
randomDataFile, corr2dCodefile, argumentFile, oversample, corrBins, \
mincorr, maxcorr, tlogbin)

#------------------------------------------------------------------------
# Compute the Angular Correlation Function
#------------------------------------------------------------------------

runcrossCorrelation(workingDir, argumentFile, corr2dCodefile, galaxyDataFile,\
qsoDataFile, randomDataFile, mincorr, maxcorr, corrBins, tlogbin)

# separation (Mpc/h) crossw
0.4300000000 -0.1156862745
1.0900000000 -0.1044776119
1.7500000000 -0.1208039566
2.4100000000 -0.0914845135
3.0700000000 -0.0393970538
3.7300000000 -0.0268417043
4.3900000000 0.0134841235
5.0500000000 0.0596093513
5.7100000000 0.0227161938
6.3700000000 0.1025539385
7.0300000000 0.0929232804
7.6900000000 0.0900670231
8.3500000000 0.0591397849
9.0100000000 0.0284723490
9.6700000000 0.0598689436


As you can see the correlation functions match!

Friday, December 10, 2010

Using the Whole Footprint with Masks

For the past few days I've been working on using masks to run correlation functions on the whole BOSS footprint. It's required learning how to use polygon masks and also to create randoms in these masks. Here is the code/testing:

To mask BOSS data (spectroscopic set)

;Read in BOSS data
bossgalaxies = '/home/jessica/boss/spAll-v5_4_14.fits'
spall = mrdfits(bossgalaxies, 1)

;Trim BOSS data to galaxies only (remove QSOs, stars, etc.)
isgal = where(spall.specprimary EQ 1 $
AND (spall.boss_target1 AND 2L^0+2L^1+2L^2+2L^3+2L^7) NE 0 $
AND strmatch(spall.class,'GALAXY*') $
AND spall.zwarning EQ 0)

;Select galaxies
galaxies = spall[isgal]
ra = galaxies.ra
dec = galaxies.dec
z = galaxies.z

;Select objects in the masks
weights = maskdataboth(ra,dec,z)

; Readout ra, dec, redshift of spectroscopic objects inside mask
thisfile = '/home/jessica/boss/spectroRaDecZMasked.dat'
writecol,thisfile,ra,dec,z
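A note on the boss_target1 cut above: in IDL, ^ is exponentiation, so 2L^0+2L^1+2L^2+2L^3+2L^7 builds a bit mask for target flag bits 0-3 and 7, and the AND is bitwise. The same selection in Python, with a made-up flag array for illustration:

```python
import numpy as np

# Hypothetical boss_target1 values: bit 2 set, bit 7 set, bit 5 set
boss_target1 = np.array([0b00000100, 0b10000000, 0b00100000])

# Bits 0-3 and bit 7, i.e. 2^0 + 2^1 + 2^2 + 2^3 + 2^7
target_mask = (1 << 0) | (1 << 1) | (1 << 2) | (1 << 3) | (1 << 7)

# True where any of the selected target bits is set
is_target = (boss_target1 & target_mask) != 0
print(is_target)   # [ True  True False]
```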

The code for maskdataboth is here:
/home/jessica/repository/ccpzcalib/Jessica/pythonJess/maskdataboth.pro

'maskdataboth' filters the data through both the BOSS mask and Shirley's mask to make sure the data sets are in the same ra/dec footprint.


To Mask Shirley's Data (photometric set)

;Read in Shirley's Photometric Data
file1 = "/home/shirleyho/research/SDSS_PK/powerspec_dir/DATA/ra_dec_z_run_rerun_camcol.z_gt_0.01"
readcol, file1, pra, pdec, pz, RUN, rerun, camcol, x, x, x, x, x, x, x, x, format='(F,F,F,D,D,D,D,D,F,F,F,F,F,F)'

;Select objects in the masks
pweights = maskdataboth(pra,pdec,pz)

; Readout ra, dec, redshift of photometric objects inside mask
thisfile = '/home/jessica/boss/photoRaDecZMasked.dat'
writecol,thisfile,pra,pdec,pz

Here are plots of the data sets to show that they do indeed fall in the same region of the sky:




Both these data sets are saved in the following directory: /home/jessica/boss/
The code to run this is here: ../logs/101210log.pro

Monday, October 25, 2010

Code for Random Catalog with Mask

From Wolfram Math World:


To pick a random point on the surface of a sphere, it is incorrect to select spherical coordinates θ and φ from uniform distributions θ ∈ [0,2π) and φ ∈ [0,π], since the area element dΩ=sinφdθdφ is a function of φ, and hence points picked in this way will be "bunched" near the poles (left figure above).

To obtain points such that any small area on the sphere is expected to contain the same number of points (right figure above), choose u and v to be random variates on (0,1). Then

θ = 2πu

φ = cos⁻¹(2v − 1)

gives the spherical coordinates for a set of points which are uniformly distributed over the surface of the sphere. This works since the differential element of solid angle is given by

dΩ = sin φ dθ dφ = −dθ d(cos φ)

So for us to pick a random ra and dec on the sphere we want to use the following in IDL:

theta = 2*!pi*RANDOMU(S, 1)
phi = acos(2*RANDOMU(S, 1) - 1)

This should give us a random theta and phi where θ ∈ [0,2π) and φ ∈ [0,π].
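A quick numpy version of the same recipe (my convention, not the HEALPix one; the uniformity check at the end is just a sanity test I added):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100000

# u, v uniform on (0,1); theta in [0, 2pi), phi in [0, pi]
u = rng.random(n)
v = rng.random(n)
theta = 2.0 * np.pi * u
phi = np.arccos(2.0 * v - 1.0)

# Sanity check: z = cos(phi) should be uniform on [-1, 1],
# so its mean is ~0 and its variance is ~1/3
z = np.cos(phi)
```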

Unfortunately, just to be difficult, HEALPix expects the following as inputs for ang2pix_nest:

theta - angle (along meridian), in [0,Pi], theta=0 : north pole
phi - angle (along parallel), in [0,2*Pi]

Their convention is the opposite of mine above, so we should use the following:
phi = 2*!pi*RANDOMU(S, 1)
theta = acos(2*RANDOMU(S, 1) - 1)

Here is the code:

file = "maskfile.fits"
; nside_in is the HEALPix resolution parameter of the mask (npix = 12*nside_in^2)
read_fits_map, file, weight, nside=nside_in, ordering=order

; make random phi and theta
phi = 2*!pi*RANDOMU(S, 1)
theta = acos(2*RANDOMU(S, 1) - 1)

; Find pixel ipnest
ang2pix_nest, nside_in, theta, phi, ipnest

; Find weight of pixel
thisweight = weight[ipnest]

; check to make sure pixel center (thistheta, thisphi) is close to phi and theta
pix2ang_nest, nside_in, ipnest, thistheta, thisphi

(Checked this for several theta/phi and it looked good)


So here is the code to make the random catalog of size randomsize:

randomsize = 100
randomtheta = findgen(randomsize)*0.0
randomphi = findgen(randomsize)*0.0
index = 0

while index lt randomsize do begin
    thisindex = index
    while index eq thisindex do begin
        phi = 2*!pi*RANDOMU(S, 1)
        theta = acos(2*RANDOMU(S, 1) - 1)
        ang2pix_nest, nside_in, theta, phi, ipnest
        thisweight = weight[ipnest]
        if thisweight gt 2.0 then begin
            randomtheta[index] = theta
            randomphi[index] = phi
            index = index+1
        endif
    endwhile
endwhile
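The same rejection-sampling idea in Python, with a hypothetical in_mask(theta, phi) function standing in for the HEALPix weight lookup (here a toy cut that keeps only the northern hemisphere, not the real mask's weight > 2 condition):

```python
import numpy as np

rng = np.random.default_rng(1)

def in_mask(theta, phi):
    """Hypothetical stand-in for the HEALPix weight lookup:
    keep points in the northern hemisphere (theta < pi/2)."""
    return theta < np.pi / 2

def random_catalog(randomsize):
    """Draw uniform sky points, keeping only those inside the mask."""
    randomtheta = np.empty(randomsize)
    randomphi = np.empty(randomsize)
    index = 0
    while index < randomsize:
        # uniform point on the sphere, as in the recipe above
        phi = 2.0 * np.pi * rng.random()
        theta = np.arccos(2.0 * rng.random() - 1.0)
        if in_mask(theta, phi):
            randomtheta[index] = theta
            randomphi[index] = phi
            index += 1
    return randomtheta, randomphi

randomtheta, randomphi = random_catalog(100)
```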

Thursday, March 11, 2010

Changing Luminosity Function

I've figured out how to change Joe Hennawi's luminosity function code. There is a log file which has the actual IDL code (100309log.pro). I will also outline it here.
  1. Change QSO_LF_SHEN such that the proper luminosity function is used. This is done in the following section (change J06 and COMBO17-PDE):

    IF z LE 3.5d THEN data = 'J06' $
    ELSE IF z GT 3.5d AND z LE 5.5d THEN data = 'COMBO17-PDE'
    result = QSO_LF(z, imag_normz, data = data)


  2. Change hiz_qso_dndz so that it writes to the correct file (change outputfile):

    , textout = './outputfile.dat'

  3. Run hiz_qso_dndz to make the quasar luminosity function file:

    .run hiz_qso_dndz

  4. Check that the produced luminosity function looks reasonable by plotting:

    rdfloat,'./outfile.dat',z,dndz,skip=1
    pz=(dndz-shift(dndz,1))/(z-shift(z,1))
    plot, z, pz

  5. Change hiz_kde_numerator so that it uses the luminosity output file generated in hiz_qso_dndz and that creates the QSO Catalog in the desired place:

    dNdz_file = './output.dat'
    sim_file = '~/QSOCatalog.fits'

  6. Run hiz_kde_numerator to generate the QSO Catalog:

    .run hiz_kde_numerator.pro

  7. Look at the QSO Catalogs by plotting in IDL.
The above plot shows several luminosity functions.
Green is Richards 06, cyan is Jiang, purple is Jiang Combined (what we used in the previous QSO Catalog), and white is a Richards/Jiang average (suggested by Myers).


The magenta points are low redshift quasars (z less than 2.15)
The green points are high redshift quasars (z greater than 2.15)

We still seem to be missing quasars around u-g = 0.8. I want to add some quasars in this color space to the inputs of the Monte Carlo, to isolate whether it is a problem with the luminosity function or a problem with our inputs. Do you have thoughts, Joe?

Wednesday, January 20, 2010

Struggling with Eclipse

I use Eclipse to interface with the SVN repository where Alexia and I share our code. At some point in the last month the repository location at Princeton was moved, and this caused Eclipse to be unhappy. I should be able to change the location of the repository in the following way:

Go to "SVN Repositories" perspective, right click on current repository and "relocate" to new location.

This failed.

So, I had to re-download it manually, transfer files that were newer on my system to the new repository and then update manually, then add the new repository to Eclipse and delete the old one. Annoying!

However, I did discover that there is a way to ignore certain types of files from being checked in using Eclipse. You do this by going to "Pydev Package Explorer" and then right-clicking on a file, and then select "Team" and then "Add to SVN Ignore." This is helpful to not automatically load the large .dat files to the repository.

Now that I have everything working again, time to explore the new crl.py file....