PaintMyChromosomes.com
fineSTRUCTURE v2 & GLOBETROTTER

© 2012 Daniel Lawson.

Summary of tools for data preparation

Main tools (links to more information and download):


  • fineSTRUCTURE, which assigns individuals to populations based on the ChromoCombine output (using MCMC or stochastic optimization). It includes a cut-down version of ChromoPainter and ChromoCombine, giving you everything you need for normal use in one package.
  • GLOBETROTTER, which infers admixture dates. It requires a complete version of ChromoPainter, which is available from the main download page.

Utility tools provided here:


The manual lists the software that is included in fineSTRUCTURE. Most of the tools below are no longer required, but some provide additional functionality that may be helpful.

  • qsub_run.sh, the job submitter included in fineSTRUCTURE. It allows jobs to be submitted to qsub and similar systems, using GNU parallel to run a large number of single-threaded jobs in parallel on HPC machines without overloading the queuing system.
  • memorycap, which allows you to monitor and cap the memory used by a process. This is extremely helpful for managing runs on an institutional cluster when there are large numbers of both SNPs and individuals, for which ChromoPainter reserves a lot of memory.
  • makeuniformrecfile.pl, which creates a uniform recombination file for use with the linkage model of ChromoPainter.
  • neaverage.pl, which computes the average value of the effective population size when using ChromoPainter in EM mode for parameter estimation.
  • plink2chromopainter.pl, a conversion script for going from PLINK style PED and MAP files to ChromoPainter's PHASE and MAP files.
  • impute2chromopainter.pl, a conversion script for going from the IMPUTE2 phased format (".haps" files, as also produced by SHAPEIT) to ChromoPainter's PHASE and MAP files.
  • chromopainter2impute2.pl, a conversion script for going from PHASE format to IMPUTE2 and SHAPEIT ".haps" files.
  • transpose.pl, a tool to transpose matrices. For example, if you prepared your files in Excel, they might be transposed relative to the layout required here.
  • chromopainterindivrename.pl, a tool to add individual names to ChromoPainter output if you did not set this up correctly beforehand.
  • phasesubsample.pl, to extract subsets of a PHASE file (e.g. to test code, or to perform EM estimation on small datasets).
  • phasescreen.pl, to remove non-varying sites or singletons from a PHASE file.
  • ped2ippca.pl, to convert to ippca's CSV format.
  • FineSTRUCTURE R tools, for advanced plotting, using continent force files, and creating PCA plots from known populations.
  • msms2cp.pl, to convert msms and ms format to ChromoPainter's PHASE format.
  • hap2dip.pl, to convert a haploid ChromoPainter matrix (fineSTRUCTURE input matrix) into a diploid one, optionally adding names.
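
To make two of the simpler operations above concrete, here is a minimal Python sketch of what "transposing a matrix" and "removing non-varying sites or singletons" mean. It is an illustration only, not the code of transpose.pl or phasescreen.pl. The function names are hypothetical, and the input is assumed to be an in-memory table or a list of biallelic haplotype strings made of "0" and "1" (one string per haplotype, one character per SNP) rather than a complete PHASE file with its header lines.

```python
from typing import List


def transpose(rows: List[List[str]]) -> List[List[str]]:
    """Swap the rows and columns of a rectangular table (hypothetical helper)."""
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all rows must have the same number of fields")
    return [list(col) for col in zip(*rows)]


def screen_sites(haplotypes: List[str], min_minor_count: int = 2) -> List[str]:
    """Drop SNP columns whose minor allele count is below min_minor_count.

    With the default of 2 this removes non-varying sites (minor count 0)
    and singletons (minor count 1). Assumes biallelic "0"/"1" data.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("all haplotypes must cover the same number of SNPs")
    n_haps = len(haplotypes)
    keep = []
    for j in range(n_sites):
        ones = sum(h[j] == "1" for h in haplotypes)
        # The minor allele is whichever of "0" or "1" is rarer at this site.
        if min(ones, n_haps - ones) >= min_minor_count:
            keep.append(j)
    return ["".join(h[j] for j in keep) for h in haplotypes]


if __name__ == "__main__":
    print(transpose([["a", "b", "c"], ["d", "e", "f"]]))
    print(screen_sites(["0101", "0110", "0010", "0011"]))
```

In the second call, the first SNP is non-varying and the third is a singleton, so only the second and fourth SNPs are kept. For real data, the Perl scripts listed above remain the tools to use, since they handle the actual file formats.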

Other software that will likely be useful:


  • IMPUTE2, which both phases data and imputes missing SNPs. Both of these stages are necessary for the best use of our software, and the imputation stage is always necessary if your data contain missing values. IMPUTE2 is a convenient choice of phasing software because we provide a conversion script (impute2chromopainter.pl).
  • PLINK, a popular software suite for manipulating genetic data. PLINK's file format is not the same as ChromoPainter's, but we provide plink2chromopainter.pl, which converts PLINK files into ChromoPainter's format (for both the UNLINKED and LINKED cases).
  • FCgene, a format converter that handles many common file formats, including PLINK. PLINK files can then be converted into PHASE format for ChromoPainter using the plink2chromopainter.pl script (see above).