Notebook part of TFM Carlos Toro Peñas


Analysis of several correlation function estimators on the 2dFGRS catalog


This notebook is an extension inspired by a practical exercise from the Python course of the Data Science master's degree. We have chosen the 2dFGRS catalog instead of the SDSS LRG sample, because the latter requires a lot of resources and API usage to load the FITS files.

Load data:

Data transformation and unit conversion:
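As a rough sketch of what this step can look like, the snippet below maps sky positions to Cartesian coordinates in $h^{-1}$Mpc. It assumes the catalog provides right ascension and declination in degrees plus a redshift, and it assumes the low-redshift Hubble-law distance $d \approx cz/H_0$; the conversion used in this notebook may differ. The name `sky_to_cartesian` is hypothetical.

```python
import numpy as np

C_KM_S = 299792.458   # speed of light in km/s
H0_H_UNITS = 100.0    # H0 in h km/s/Mpc, so distances come out in h^-1 Mpc


def sky_to_cartesian(ra_deg, dec_deg, redshift):
    """Hypothetical helper: (RA, Dec, z) -> Cartesian coordinates in h^-1 Mpc.

    Assumes the low-redshift Hubble-law approximation d = c * z / H0.
    """
    ra = np.radians(np.asarray(ra_deg, dtype=float))
    dec = np.radians(np.asarray(dec_deg, dtype=float))
    dist = C_KM_S * np.asarray(redshift, dtype=float) / H0_H_UNITS

    # Standard spherical-to-Cartesian conversion with Dec measured from the equator
    x = dist * np.cos(dec) * np.cos(ra)
    y = dist * np.cos(dec) * np.sin(ra)
    z = dist * np.sin(dec)
    return np.column_stack([x, y, z])
```

For redshifts beyond a few tenths the linear approximation degrades, and a comoving distance computed from a full cosmological model would be the safer choice.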

Generate a mock catalog that follows a Poisson distribution in $\mathbb{R}^3$. Note that we have to take the cube of the radius:
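A minimal sketch of why the cube of the radius matters: for a uniform density inside a sphere, the enclosed volume grows as $r^3$, so it is $r^3$ (not $r$) that must be drawn uniformly. The function name `uniform_points_in_sphere` is hypothetical, and the sketch assumes a full sphere of radius `r_max` with a fixed number of points, ignoring the survey's angular mask.

```python
import numpy as np


def uniform_points_in_sphere(n_points, r_max, rng=None):
    """Hypothetical helper: n_points uniformly distributed inside a sphere."""
    rng = np.random.default_rng(rng)

    # r^3 is uniform in [0, r_max^3], hence r = r_max * u^(1/3)
    r = r_max * np.cbrt(rng.uniform(0.0, 1.0, n_points))

    # Isotropic directions: cos(theta) uniform in [-1, 1], phi uniform in [0, 2*pi)
    cos_theta = rng.uniform(-1.0, 1.0, n_points)
    sin_theta = np.sqrt(1.0 - cos_theta**2)
    phi = rng.uniform(0.0, 2.0 * np.pi, n_points)

    return np.column_stack([
        r * sin_theta * np.cos(phi),
        r * sin_theta * np.sin(phi),
        r * cos_theta,
    ])
```

Drawing $r$ itself uniformly would instead pile points up near the center, which would bias every pair count that uses the mock catalog.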

Plot the simulated catalog in space $\mathbb{R}^3$

Plot data_sample in space $\mathbb{R}^3$

Let's compare the underlying continuous distance distributions of our sample and the synthetic sample by using a Kolmogorov-Smirnov test:
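The comparison can be sketched with `scipy.stats.ks_2samp`, the two-sample Kolmogorov-Smirnov test. The arrays `data_dist` and `mock_dist` below are synthetic placeholders standing in for the radial distances of the real and mock catalogs, and the 0.05 significance level is an assumption.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)

# Placeholder radial distances (assumption: both drawn uniformly in volume)
data_dist = 100.0 * np.cbrt(rng.uniform(size=2000))
mock_dist = 100.0 * np.cbrt(rng.uniform(size=2000))

result = ks_2samp(data_dist, mock_dist)

alpha = 0.05  # assumed significance level
not_rejected = result.pvalue > alpha
print(f"KS statistic = {result.statistic:.4f}, p-value = {result.pvalue:.4f}")
print("Same-distribution hypothesis rejected:", not not_rejected)
```

A large p-value only means the test finds no evidence against a common distribution; it does not prove that the two distributions are identical.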

Yes: the test does not reject the hypothesis that our sample and the mock sample have the same distribution in $\mathbb{R}^3$.

The next step is to use the theory to calculate the correlation function with different estimators: Natural, Hamilton, Davis and Peebles, and Landy and Szalay. Note the use of KDTree(data_coords) and KDTree(mock_coords).

Take the following considerations into account:

Normalization: The code uses $N(N-1)$ for $DD$ and $RR$ because we are looking at pairs within the same catalog. For $DR$, it uses $N_d \cdot N_r$ because every data point is compared against every random point.

Computational Efficiency: Using scipy.spatial.KDTree is significantly faster than a brute-force $O(N^2)$ distance matrix, especially for large datasets common in astronomy or molecular dynamics. Take a look at https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.KDTree.html
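The two considerations above can be sketched as follows. This is an illustrative version, not necessarily the notebook's own implementation: it uses `KDTree.count_neighbors` to get cumulative pair counts at the bin edges, differences them into per-bin counts, and applies the normalizations described above. With normalized counts $DD$, $DR$ and $RR$, the estimators are Natural $DD/RR - 1$, Davis and Peebles $DD/DR - 1$, Hamilton $DD \cdot RR / DR^2 - 1$, and Landy and Szalay $(DD - 2DR + RR)/RR$. The function names `pair_counts` and `correlation_estimators` are hypothetical.

```python
import numpy as np
from scipy.spatial import KDTree


def pair_counts(tree_a, tree_b, edges):
    """Pair counts per separation bin, from cumulative counts at the edges.

    For tree_a is tree_b, every pair is counted twice (ordered pairs); the
    zero-distance self-pairs cancel in the difference when edges[0] > 0.
    """
    cumulative = tree_a.count_neighbors(tree_b, edges)
    return np.diff(cumulative).astype(float)


def correlation_estimators(data_coords, mock_coords, edges):
    """Hypothetical helper returning the four estimators per bin."""
    n_d, n_r = len(data_coords), len(mock_coords)
    data_tree = KDTree(data_coords)
    mock_tree = KDTree(mock_coords)

    # Ordered auto-pairs are normalized by N(N-1); cross-pairs by N_d * N_r
    dd = pair_counts(data_tree, data_tree, edges) / (n_d * (n_d - 1))
    rr = pair_counts(mock_tree, mock_tree, edges) / (n_r * (n_r - 1))
    dr = pair_counts(data_tree, mock_tree, edges) / (n_d * n_r)

    # Empty bins give inf/nan instead of raising warnings
    with np.errstate(divide="ignore", invalid="ignore"):
        return {
            "natural": dd / rr - 1.0,
            "davis_peebles": dd / dr - 1.0,
            "hamilton": dd * rr / dr**2 - 1.0,
            "landy_szalay": (dd - 2.0 * dr + rr) / rr,
        }
```

A call such as `correlation_estimators(data_coords, mock_coords, np.linspace(5.0, 200.0, 40))` would return one array per estimator, with one value per separation bin. The bin edges here are only an example.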

Finally, we plot the estimator results:

These peaks close to 100 $h^{-1}$Mpc (1 $h^{-1}$Mpc $\approx$ 1.43 Mpc for $h = 0.7$) are explained by the cosmological model and are an already known feature, also imprinted in the Cosmic Microwave Background (CMB), called Baryon Acoustic Oscillations (BAO).

Globally speaking, the probability of finding galaxy pairs diminishes with distance, as expected, but around 100 $h^{-1}$Mpc the density grows again up to a certain point, producing a peak or "bump" of density.

This peak is related to the fluctuations produced in the early universe. The explanation is as follows:

In the early Universe, prior to the recombination (decoupling) epoch, baryonic matter and radiation were tightly coupled in a hot, dense plasma. While gravity acted to compress this plasma into primordial density seeds, the intense radiation pressure acted as a restoring force, pushing the baryonic matter outward.

These opposing forces created spherical acoustic pressure waves that traveled through the plasma. Crucially, Dark Matter, which does not interact with radiation, did not experience this outward pressure. Instead, it remained at the center of the original perturbation, governed only by gravity. This created a separation: a central concentration of dark matter and a spherical shell of baryonic matter moving outward at approximately 57% of the speed of light.
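The quoted speed can be checked against the standard expression for the sound speed in a photon-baryon fluid, where $R$ is the ratio of baryon to photon momentum density (a symbol introduced here for illustration):

$$
c_s = \frac{c}{\sqrt{3\,(1 + R)}}, \qquad R \equiv \frac{3\rho_b}{4\rho_\gamma}, \qquad c_s \xrightarrow{\;R \to 0\;} \frac{c}{\sqrt{3}} \approx 0.577\,c
$$

The value of about 57% therefore corresponds to the radiation-dominated limit; as baryons become more important ($R$ grows), the wave slows down somewhat.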

At approximately 380,000 years after the Big Bang (the decoupling era), the Universe cooled sufficiently for neutral atoms to form. At this point, the photons escaped (forming the Cosmic Microwave Background), and the sound wave was suddenly "frozen" in place. This left a characteristic shell of baryonic matter at a fixed distance—the sound horizon—relative to the central dark matter peak.

Over billions of years, gravity pulled more matter into both the central peak and the spherical shell. As the Universe expanded, this "fingerprint" persisted in the large-scale structure. Today, this manifests as a statistically significant excess of galaxy pairs separated by approximately $150 \text{ Mpc}$ (or $100 \text{ } h^{-1}\text{Mpc}$), appearing as the BAO peak in our 2PCF analysis.
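The relation between the two units quoted above can be sketched in a couple of lines. The helper name `h_inv_mpc_to_mpc` is hypothetical, and the default $h = 0.7$ is an assumption; values nearer $h \approx 0.67$ bring the result closer to 150 Mpc.

```python
def h_inv_mpc_to_mpc(distance_h_inv_mpc, h=0.7):
    """Hypothetical helper: convert a distance in h^-1 Mpc to Mpc.

    A distance of X h^-1 Mpc equals X / h Mpc, with h = H0 / (100 km/s/Mpc).
    """
    return distance_h_inv_mpc / h


bao_scale_mpc = h_inv_mpc_to_mpc(100.0)           # about 143 Mpc for h = 0.7
bao_scale_mpc_low_h = h_inv_mpc_to_mpc(100.0, h=0.67)  # about 149 Mpc
```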