PaintMyChromosomes.com
© 2012 Daniel Lawson.
11 Provided scripts

These are provided in the ‘scripts’ directory. To run them, add this directory to your path, copy the scripts to somewhere in your path, or specify absolute file locations. The usual caveats apply: we try to make these scripts work, but if they don’t then we aren’t responsible! Try to fix the problem yourself and let the author know of the issue.
11.1 makeuniformrecfile.pl

Creates a recombination rate map for use with the linkage model of ChromoPainter. This is essential if you do not have a recombination map for your species! The map assumes a constant rate of recombination per base, not per SNP.

    Usage: ./makeuniformrecfile.pl <phasefile> <outputfile>
        <phasefile> is a valid chromopainter inputfile ending in .phase (in ChromoPainter v1 or v2 format)
        <outputfile> will be a recombination file usable with <phasefile> in ChromoPainter, nominally in Morgans/base.

The recombination rate is scaled to be approximately that in humans (0.1 Morgans/Mb). Because of this, the map is NOT usable directly and should only ever be used in conjunction with EM parameter estimation, which corrects for the global amount of recombination.

If you are working on non-humans or simulated data, you may experience problems with EM estimation. The parameter may get stuck at a local mode where there is effectively infinite (or no) recombination. In this case, you should specify the initial conditions of ChromoPainter to have a much smaller or larger Ne (-n) value.
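As a rough illustration of what "a constant rate per base, not per SNP" means, the following Python sketch builds a uniform map from a list of SNP positions. It is not the Perl script itself: the function names are hypothetical, and the convention that the final SNP carries a rate of 0 (because no interval follows it) is an assumption. The default of 1e-7 Morgans per base is simply the 0.1 Morgans/Mb figure quoted above.

```python
def uniform_rec_map(positions, rate_per_bp=1e-7):
    """Return (position, rate) rows with a constant per-base rate.

    rate_per_bp=1e-7 corresponds to the 0.1 Morgans/Mb quoted in the text.
    Each row's rate is taken to apply to the interval up to the next SNP,
    so the final SNP gets 0 (an assumed convention, not confirmed here).
    """
    if any(b <= a for a, b in zip(positions, positions[1:])):
        raise ValueError("SNP positions must be strictly ascending")
    rows = [(pos, rate_per_bp) for pos in positions[:-1]]
    if positions:
        rows.append((positions[-1], 0.0))
    return rows


def expected_morgans(rows):
    """Total genetic length implied by the map, in Morgans."""
    total = 0.0
    for (pos, rate), (next_pos, _) in zip(rows, rows[1:]):
        total += rate * (next_pos - pos)
    return total
```

Because the rate is per base, unevenly spaced SNPs are separated by genetic distances proportional to their physical distances: a 4 kb gap and a 2 Mb gap get the same rate but very different expected numbers of recombinations.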
11.2 convertrecfile.pl

Converts between recombination map files. It can take a wide variety of map formats and convert them into a suitable format for ChromoPainter. For example, the HapMap B37 data obtained from NIH can be processed with "convertrecfile.pl -M hapmap", but any CDF or PDF style format is supported.

    convertrecfile.pl, create recombination maps for phase files from other maps.
    Copyright (C) 2014 Daniel Lawson (dan.lawson@bristol.ac.uk) licensed under GPLv3
    This is free software with NO WARRANTY, you are free to distribute and modify; see http://www.gnu.org/licenses

    Usage: ./convertrecfile.pl <MAJOR MODE> <options> phasefile inrecfile outputrecfile
        phasefile is a valid chromopainter or chromopainter v2 inputfile ending in .phase
        inrecfile is a recombination file in one of the formats specified in <mode>
        outputrecfile will be a valid recombination file for use with ChromoPainter.

    MAJOR MODES: specified with -M. (Shortest unambiguous mode option will work)
        -M <val>: Specify the major mode. <val> can be:
            hapmap: The hapmap format is specified as 4 columns: Chromosome, Position(BP), Rate(cM/Mb), Map(cM). This uses columns 2 and 4 to reconstruct the map.
            plain: (default) Assumes that the data are specified in 2 columns: Position(BP), Rate(M/b). This is the format assumed by ChromoPainter (note: the rate is Morgans per base).

    Other important options:
        -v: Verbose mode.
        -h: This help.
        -H: Detailed help on the wide variety of different options, including different column separators, different units, reading of cumulative vs non-cumulative distributions, and handling maps that do not cover the full range of the SNPs.

    EXAMPLE:
        ./convertrecfile.pl -M hap my_chr1.phase genetic_map_GRCh37_chr1.txt my_chr1.recombfile
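The hapmap mode reconstructs per-base rates from a cumulative map (columns 2 and 4). One plausible way to do this, sketched here in Python, is to interpolate the cumulative cM value linearly at each SNP and then divide the genetic distance between neighbouring SNPs by their physical distance. This is an illustrative sketch only: the function names are hypothetical, and both the clamping of SNPs outside the map's range and the rate of 0 on the last SNP are assumptions rather than the script's documented behaviour.

```python
from bisect import bisect_right


def interpolate_cm(pos, map_pos, map_cm):
    """Linearly interpolate the cumulative map (cM) at a base position.

    Positions outside the map are clamped to its end values (assumption).
    """
    if pos <= map_pos[0]:
        return map_cm[0]
    if pos >= map_pos[-1]:
        return map_cm[-1]
    i = bisect_right(map_pos, pos)  # map_pos[i-1] <= pos < map_pos[i]
    frac = (pos - map_pos[i - 1]) / (map_pos[i] - map_pos[i - 1])
    return map_cm[i - 1] + frac * (map_cm[i] - map_cm[i - 1])


def rates_from_cumulative(snp_pos, map_pos, map_cm):
    """Per-base rate (Morgans/base) between consecutive SNPs.

    snp_pos must be strictly ascending. The last SNP gets 0 because
    no interval follows it (assumed convention).
    """
    cm_at_snp = [interpolate_cm(p, map_pos, map_cm) for p in snp_pos]
    rates = []
    for k in range(len(snp_pos) - 1):
        morgans = (cm_at_snp[k + 1] - cm_at_snp[k]) / 100.0  # cM -> Morgans
        bases = snp_pos[k + 1] - snp_pos[k]
        rates.append(morgans / bases)
    if snp_pos:
        rates.append(0.0)
    return rates
```

For a map that rises by 1 cM over 1 Mb, every interval comes out at 1e-8 Morgans per base regardless of SNP spacing.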
11.3 chromopainter2chromopainterv2.pl

Converts old phase format datasets to the new format. Note that this is not strictly necessary, because fineSTRUCTURE can use either type, but the new format is much more sensible.

    CONVERTS FROM CHROMOPAINTER v1 FORMAT TO v2
    usage: perl chromopainter2chromopainterv2.pl <phasefile> <outputphasefile>
    with:
        <phasefile>: ChromoPainter/PHASE style SNP file
        <outputphasefile>: Output phase file
    <options>:
        -p <val>: Ploidy
        -v: Verbose mode
11.4 phasescreen.pl

Removes non-varying SNPs and singletons from a PHASE file. This speeds execution of ChromoPainter and does not change the output.

    REMOVE SINGLETONS OR NON-SNPS FROM PHASE DATA
    usage: perl phasescreen.pl <phasefile> <outputphasefile>
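The screening idea can be sketched in a few lines of Python. This is a minimal illustration rather than the script's implementation: it assumes haplotypes are held in memory as strings of "0" and "1" (biallelic coding), and the function name and the `min_minor_count` parameter are hypothetical.

```python
def screen_snps(positions, haplotypes, min_minor_count=2):
    """Drop SNPs whose minor allele appears fewer than min_minor_count times.

    With the default of 2 this removes non-varying sites (minor count 0)
    and singletons (minor count 1). Assumes each haplotype is a string of
    '0'/'1' characters with one character per SNP.
    """
    keep = []
    for j in range(len(positions)):
        ones = sum(1 for hap in haplotypes if hap[j] == "1")
        minor = min(ones, len(haplotypes) - ones)
        if minor >= min_minor_count:
            keep.append(j)
    new_positions = [positions[j] for j in keep]
    new_haplotypes = ["".join(hap[j] for j in keep) for hap in haplotypes]
    return new_positions, new_haplotypes
```

In the example below, the first SNP is non-varying and the last is a singleton, so only the middle three survive:

```python
positions = [10, 20, 30, 40, 50]
haplotypes = ["01101", "01010", "00110", "00000"]
print(screen_snps(positions, haplotypes))
```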
11.5 phasesubsample.pl

Subsamples phase-style data in a contiguous block. This is useful for pipeline generation and testing, although ChromoPainter now provides this facility through its -l <from> <to> option, which you can specify in -s12args.

    EXTRACTS SNP RANGE FROM PHASE (CHROMOPAINTER) FORMAT
    usage: perl phasesubsample.pl <options> <from> <to> <phasefile> <outputphasefile>
    Extract the SNPs [from to] inclusive, i.e. 1 2 extracts the 1st and 2nd SNPs.
    where:
        <from>: First SNP to retain (1 is the first snp)
        <to>: Final SNP to retain (L is the last snp)
        <phasefile>: ChromoPainter/PHASE style SNP file, i.e.
        <outputphasefile>: Output phase file
    <options>:
        -v: Verbose mode
    NB Compatible with chromopainter and chromopainterv2 phase formats.
    Updated 6th June 2017 to fix an out-by-one error.
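The inclusive, 1-based range is the part most likely to cause out-by-one mistakes when reimplementing this step. A minimal Python sketch of the index arithmetic follows, assuming haplotypes are in-memory strings with one character per SNP; the function name is hypothetical.

```python
def subsample_snps(positions, haplotypes, first, last):
    """Keep SNPs first..last inclusive, counting from 1."""
    if not 1 <= first <= last <= len(positions):
        raise ValueError("need 1 <= first <= last <= number of SNPs")
    lo, hi = first - 1, last  # 1-based inclusive -> 0-based half-open
    return positions[lo:hi], [hap[lo:hi] for hap in haplotypes]
```

Calling `subsample_snps(positions, haplotypes, 1, 2)` keeps the 1st and 2nd SNPs, matching the convention stated in the help text.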
11.6 plink2chromopainter.pl (PLINK)

Conversion script for going from PLINK (pngu.mgh.harvard.edu/~purcell/plink/) style PED and MAP files to ChromoPainter’s PHASE and RECOMBFILES files. IMPORTANT NOTE: Use plink --recode12 to get output in an appropriate format for this script! Note that many plink commands can be used without losing phasing information, despite PLINK being phasing unaware.

    Usage: ./plink2chromopainter.pl -p=pedfile -m=mapfile -o=phasefile [-d=idfile] [-f] [-g=chromosomegap] [--quiet] [--asis]
        pedfile is a valid PLINK ped inputfile (DIPLOID)
        mapfile is a valid PLINK map file
        phasefile will be a valid chromopainter phase file (ChromoPainter's -g switch) (i.e. a fastphase file with an additional header line)
        idfile is OPTIONAL and simply stores the list of individual names (ChromoPainter's -t switch, but without the optional population and inclusion columns)
    YOU STILL NEED TO CREATE A RECOMBINATION FILE; either with convertrecfile.pl or makeuniformrecfile.pl.
        -f: Specify that the IDs from the FIRST column of the ped file (the family ID) should be used. The default is to try the second and fall back to the first.
        -g chromosomegap (=10e6 by default) is the gap in BP placed between different chromosomes
        -a or --asis: assume the SNPs are stored as 0/1 rather than 1/2 (default plink behaviour)
        -q or --quiet: reduces the amount of screen output
    IMPORTANT: You should use the --recode12 option in plink
    MORE HELP ON FILE FORMATS: ./plink2chromopainter.pl -h
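To make the PED layout concrete, here is a hedged Python sketch of how one diploid PED line might be split into two haplotypes. A PED line has six leading columns (family ID, individual ID, father, mother, sex, phenotype) followed by two alleles per SNP. Everything else here is an assumption for illustration: the function and parameter names are hypothetical, the first allele of each pair is taken to belong to the first haplotype, the recoding 1 to 0 and 2 to 1 is a guess at a sensible mapping, and the ID choice is simplified to a single flag rather than the fallback behaviour described for -f.

```python
def ped_line_to_haplotypes(line, use_family_id=False, as_is=False):
    """Split one diploid PED line into (name, haplotype_a, haplotype_b).

    Assumes alleles are coded 1/2 (as written by --recode12) unless
    as_is=True, in which case they are assumed to already be 0/1.
    """
    fields = line.split()
    family_id, individual_id = fields[0], fields[1]
    alleles = fields[6:]  # skip the six leading PED columns
    if len(alleles) % 2 != 0:
        raise ValueError("expected two alleles per SNP")
    # Hypothetical recoding: 1 -> 0 and 2 -> 1 (illustrative only).
    recode = {"0": "0", "1": "1"} if as_is else {"1": "0", "2": "1"}
    try:
        hap_a = "".join(recode[a] for a in alleles[0::2])
        hap_b = "".join(recode[a] for a in alleles[1::2])
    except KeyError as err:
        raise ValueError(f"unexpected allele code {err}") from None
    name = family_id if use_family_id else individual_id
    return name, hap_a, hap_b
```

Any allele code outside the expected pair (for example a missing genotype written as 0 in 1/2 mode) raises an error in this sketch, since phased input to ChromoPainter is expected to be complete.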
11.7 impute2chromopainter.pl (SHAPEIT format)

Conversion script for going from IMPUTE2 format, which includes SHAPEIT (www.shapeit.fr) output, to ChromoPainter’s PHASE and RECOMBFILES files.

    CONVERTS PHASED SHAPEIT/IMPUTE2 OUTPUT TO CHROMOPAINTER-STYLE INPUT FILES
    usage: perl impute2chromopainter.pl <options> impute_output_file.haps output_filename_prefix
    where:
        (i) impute_output_file.haps = filename of IMPUTE2 output file with suffix ".haps" that contains phased haplotypes
        (ii) output_filename_prefix = filename prefix for chromopainter input file(s). The suffix ".phase" is added
    The output, by default, is in CHROMOPAINTER v2 input format.
    <options>:
        -J: Jitter (add 1) snp locations if snps are not strictly ascending. Otherwise an error is produced.
    <further options>
    NOTE: YOU ONLY NEED THESE OPTIONS FOR BACKWARDS COMPATIBILITY!
        -v1: Produce output compatible with CHROMOPAINTER v1, i.e. include the line of "S" for each SNP.
        -f: By default, this script produces PHASE-style output, which differs from ChromoPainter input which requires an additional first line. This option creates the correct first line for standard fineSTRUCTURE usage (i.e. the first line is "0", all other lines are appended)
    NOTE: TO USE IN CHROMOPAINTER: You also need a recombination map. Create this with the "convertrecfile.pl" or "makeuniformrecfile.pl" scripts provided.
    !!! WARNING: THIS PROGRAM DOES NOT SUFFICIENTLY CHECK FOR MISSPECIFIED FILES. WE ARE NOT ACCOUNTABLE FOR THIS RUNNING INCORRECTLY !!!
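The core of this conversion is a transposition: a .haps file has one line per SNP (five annotation columns, then two 0/1 columns per individual), whereas phase-style data has one line per haplotype. The Python sketch below illustrates that step only, under the assumption that the third column holds the base position; the function name is hypothetical and no header lines are written.

```python
def haps_to_haplotypes(lines):
    """Transpose SNP-major .haps lines into haplotype-major strings.

    Each input line is assumed to be: id, rsid, position, allele0,
    allele1, followed by one 0/1 column per haplotype.
    """
    positions = []
    snp_rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        positions.append(int(fields[2]))
        snp_rows.append(fields[5:])
    n_haps = len(snp_rows[0]) if snp_rows else 0
    if any(len(row) != n_haps for row in snp_rows):
        raise ValueError("every SNP line must have the same number of columns")
    haplotypes = ["".join(row[h] for row in snp_rows) for h in range(n_haps)]
    return positions, haplotypes
```

Consecutive pairs of the returned haplotypes belong to the same individual, in the order the individuals appear in the input columns.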
11.8 beagle2chromopainter.pl (BEAGLE format)

Conversion script for going from BEAGLE 3 format to ChromoPainter PHASE format.

    CONVERTS PHASED BEAGLE OUTPUT TO CHROMOPAINTER-STYLE INPUT FILES
    usage: perl beagle2chromopainter.pl <options> beagle_phased_output_file output_filename_prefix
    where:
        (i) beagle_phased_output_file = filename of BEAGLE v3 or less (not vcf!) phased file (unzipped) that contains phased haplotypes
        (ii) output_filename_prefix = filename prefix for chromopainter input file(s). The suffixes ".phase" and ".ids" are added
    The output, by default, is in CHROMOPAINTER v2 input format.
    NOTE THAT ONLY BIALLELIC SNPS ARE RETAINED, i.e. we omit triallelic and non-polymorphic sites.
    <options>:
        -J: Jitter (add 1) to snp locations if snps are not strictly ascending. Otherwise an error is produced.
    <further options>
    NOTE: YOU ONLY NEED THESE OPTIONS FOR BACKWARDS COMPATIBILITY!
        -v1: Produce output compatible with CHROMOPAINTER v1, i.e. include the line of "S" for each SNP.
        -f: By default, this script produces PHASE-style output, which differs from ChromoPainter input which requires an additional first line. This option creates the correct first line for standard fineSTRUCTURE usage (i.e. the first line is "0", all other lines are appended)
    !!! WARNING: THIS PROGRAM DOES NOT SUFFICIENTLY CHECK FOR MISSPECIFIED FILES. WE ARE NOT ACCOUNTABLE FOR THIS RUNNING INCORRECTLY !!!
    NOTE: TO USE IN CHROMOPAINTER: You also need a recombination map. Create this with the "convertrecfile.pl" or "makeuniformrecfile.pl" scripts provided.
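The -J option is described only as "add 1" when SNP locations are not strictly ascending. One plausible reading, sketched in Python below, is that any position that does not exceed its predecessor is moved to one base past it. This is an assumption about the behaviour, not a description of the script, and the function name is hypothetical.

```python
def jitter_positions(positions):
    """Make positions strictly ascending by nudging ties and reversals.

    Any position that is not greater than the previous (already adjusted)
    position is replaced by previous + 1.
    """
    adjusted = []
    for pos in positions:
        if adjusted and pos <= adjusted[-1]:
            pos = adjusted[-1] + 1
        adjusted.append(pos)
    return adjusted
```

Under this reading a run of identical positions such as 100, 100, 100 becomes 100, 101, 102, and positions that were already strictly ascending are left unchanged.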
11.9 msms2cp.pl (MSMS and MS output format)

Conversion script for going from data simulated by MS (home.uchicago.edu/rhudson1/source/mksamples.html) or its variants, including MSMS (www.mabs.at/ewing/msms), to ChromoPainter’s PHASE and RECOMBFILES files.

    CONVERTS MSMS/SCRM/MS OUTPUT TO CHROMOPAINTER-STYLE INPUT FILES
    usage: perl msms2cp.pl <options> msmsoutput.txt output_filename_prefix
    OPTIONS
        -c1 : Output chromopainter version1 format
        -n <x> : Multiplier for the SNP locations (default: 1000000)
        -p <x> : Specify the ploidy (default: 2 for diploid; needed only for CP version 1)
        -ms <x>: Specify ms mode, and give the number of *haplotypes* in it (because ms doesn't include this in the header)
        -v : Verbose mode
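The multiplier exists because ms-style output reports SNP locations as fractions of the simulated region (between 0 and 1), whereas phase files need integer-like base positions. The Python sketch below reads the first replicate of ms-style output and scales its "positions:" line. It is an illustration under assumptions: the function name is hypothetical, rounding to the nearest integer is a choice made here, and only the first replicate is read.

```python
def parse_ms_replicate(text, multiplier=1000000):
    """Return (positions, haplotypes) for the first replicate in ms output.

    Positions on the 'positions:' line are fractions of the region and are
    scaled by `multiplier`, then rounded to integers (an assumption).
    """
    positions, haplotypes = [], []
    in_block = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("//"):
            if in_block:
                break  # a second '//' starts the next replicate
            in_block = True
        elif not in_block:
            continue  # skip the command line and seeds before the first '//'
        elif line.startswith("positions:"):
            fractions = [float(x) for x in line.split()[1:]]
            positions = [round(f * multiplier) for f in fractions]
        elif line and set(line) <= {"0", "1"}:
            haplotypes.append(line)
    return positions, haplotypes
```

Note that closely spaced simulated SNPs can round to the same integer position, so a larger multiplier may be needed for dense simulations.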