Suitable Data


Publicly available datasets

  • The HGDP data is a good place to start. You can download pre-phased data in PHASE output format, which is in a form very close to that required by ChromoPainter.
  • The HapMap B37 human recombination map, obtained from NIH, can be processed with "convertrecfile.pl -M hapmap". See tools (or download the tool convertrecfile.pl directly).
  • Our simulated data is available for download. It is intended for use with the Complex Example.
  • Our HGDP Coancestry (i.e. chunk counts) matrix, as described in the main paper, is available in case it is of use to anyone. In addition, you can download the HGDP Population results as an R object. This contains the coancestry matrix ("chunkcounts"), the list of populations ("poplist"), the population-wise average ("avemat") and SD ("sdmat") matrices, and the tree ("hgdpdend"). It should be used with the R library. The tree, the populations and the population-level matrices all share the same order. The individuals are ordered using the HGDP ordering (trivially processed from the HGDP page, which has one line per haplotype rather than per individual), with individual POPi being the i-th entry of that population; for example, HGDP00791 is Japanese27.
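
The POPi labelling described above can be illustrated with a short Python sketch. This is not the script used to produce the HGDP results. It assumes a hypothetical input of (sample ID, population) rows with one row per haplotype, and the function name label_individuals and the placeholder IDs are invented for illustration. It collapses repeated rows to one entry per individual and numbers individuals within each population in order of first appearance.

```python
from collections import defaultdict


def label_individuals(haplotype_rows):
    """Map each sample ID to a label of the form <population><i>.

    haplotype_rows: iterable of (sample_id, population) pairs, one pair per
    haplotype (so each individual appears more than once). This input layout
    is an assumption made for the example.
    """
    counts = defaultdict(int)  # individuals seen so far in each population
    labels = {}                # sample_id -> label, in order of first appearance
    for sample_id, population in haplotype_rows:
        if sample_id in labels:
            continue  # later haplotype of an individual already labelled
        counts[population] += 1
        labels[sample_id] = f"{population}{counts[population]}"
    return labels


# Placeholder IDs and populations, two haplotype rows per individual.
rows = [
    ("ID_A", "PopX"), ("ID_A", "PopX"),
    ("ID_B", "PopX"), ("ID_B", "PopX"),
    ("ID_C", "PopY"), ("ID_C", "PopY"),
]
labels = label_individuals(rows)
for sample_id, label in labels.items():
    print(sample_id, label)
```

Applied to a list in the real HGDP ordering, the same counting scheme would be expected to yield labels of the kind quoted above, although that has not been checked against the actual files here.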

  • Sampling your own data

    This section is under preparation, but see the advice on phasing.
  • As noted above, the HapMap B37 human recombination map, obtained from NIH, can be processed with "convertrecfile.pl -M hapmap". See tools (or download the tool convertrecfile.pl directly).