A 5-Minute Introduction to Using SMARTPOP
Once installed, you should be ready to run SMARTPOP.
Create a new directory for your project (e.g. qstart).
mkdir qstart
cd qstart
Remember to call SMARTPOP from the directory where it is located, or add that directory to your PATH. The following command line starts 100 simulations with 1000 individuals. Use the -v (verbose) flag to have the software print comments in the terminal.
./smartpop -p 1000 -nsimu 100 -v
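If it is unclear whether the shell can find the program, a small check along the following lines may help. This is only a sketch: the function name find_smartpop is made up for illustration, and it assumes the executable is called smartpop, as in the commands of this guide.

```shell
# Report where smartpop would be run from.
# Assumption: the executable is named "smartpop", as in this guide.
find_smartpop() {
    if command -v smartpop >/dev/null 2>&1; then
        # Found through PATH: print its location.
        command -v smartpop
    elif [ -x ./smartpop ]; then
        # Not on PATH, but present in the current directory.
        echo "./smartpop"
    else
        echo "smartpop not found: add its directory to PATH" >&2
        return 1
    fi
}
```

Calling find_smartpop prints the path that would be used, or an error message if the program is neither on the PATH nor in the current directory.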
Control the sequence length (per locus) with the -sizeMt (mitochondrial DNA), -sizeY (Y chromosome), -sizeX (X chromosome) and -sizeA (autosomes) flags, and the number of loci with the -nbLociX and -nbLociA flags. Let's simulate sequences for the mitochondrial DNA, the non-recombining Y chromosome and the X chromosome, as well as two autosomal loci, all 320 bp long.
smartpop -p 1000 -nsimu 100 -v -sizeMt 320 -sizeY 320 -nbLociX 1 -sizeX 320 -nbLociA 2 -sizeA 320
You can also set the number of generations to run with the -t flag (e.g. 200 generations).
smartpop -p 1000 -nsimu 100 -sizeMt 320 -sizeY 320 -nbLociX 1 -sizeX 320 -nbLociA 2 -sizeA 320 -t 200 -v
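Because the same long list of options recurs from one command to the next, it can be convenient to keep it in a shell variable. The sketch below uses only flags that appear in the commands above; the names COMMON and show_cmd are hypothetical, and the function only prints the command so it can be checked before anything is run.

```shell
# Options shared by the commands so far (all flags taken from this guide).
COMMON="-p 1000 -nsimu 100 -sizeMt 320 -sizeY 320 -nbLociX 1 -sizeX 320 -nbLociA 2 -sizeA 320 -t 200"

# Dry run: print the full command line instead of executing it.
# Any extra flags given as arguments are appended after the shared options.
show_cmd() {
    echo "smartpop $COMMON $*"
}

show_cmd -v
```

Once the printed line looks right, the simulation itself can be started with smartpop $COMMON -v (leave $COMMON unquoted so the shell splits it into separate options).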
Now control which outputs to produce. The simplest ones are estimators of diversity computed by SMARTPOP and measured at the end of the simulation.
By default, these output files are named smartpop_X_YZ_div, where X is the random seed (which, by default, changes each time you launch a new simulation), Y is the mating system and Z is the DNA type for which diversity is calculated. Check your folder for the files produced by the previous examples.
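One way to find these files is to match the naming pattern with a shell glob. The helper below is a sketch under assumptions: list_div_files is a made-up name, the files are assumed to sit in the directory given as argument (the current one by default), and the trailing * only allows for a possible file extension that is not described here.

```shell
# List default SMARTPOP diversity files in a directory (default: current one).
# Assumption: names follow the smartpop_X_YZ_div pattern described above.
list_div_files() {
    dir=${1:-.}
    for f in "$dir"/smartpop_*_div*; do
        # When nothing matches, the pattern is left as-is; skip it.
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
}
```

Running list_div_files inside the project directory prints one file name per line, or nothing if no diversity file has been written yet.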
In the next example, we will only output the values for mitochondrial DNA and Y chromosome.
smartpop -p 1000 -nsimu 100 -sizeMt 320 -sizeY 320 -nbLociX 1 -sizeX 320 -nbLociA 2 -sizeA 320 -t 200 -mtdiv -ydiv -v
If you only wish to look at the mitochondrial DNA and Y chromosome patterns, you can remove the X chromosome and autosomes from the simulations by setting either their locus size or their number of loci to 0.
smartpop -p 1000 -nsimu 100 -sizeMt 320 -sizeY 320 -nbLociX 0 -nbLociA 0 -t 200 -mtdiv -ydiv -v
It is recommended that you give your output files a meaningful name. In this case, let's use the -o flag to name the output qstart_simu:
smartpop -p 1000 -nsimu 100 -sizeMt 320 -sizeY 320 -nbLociX 1 -sizeX 320 -nbLociA 2 -sizeA 320 -t 200 -mtdiv -ydiv -o qstart_simu -v
Check the folder to see that two new files have been created: qstart_simuMt and qstart_simuY. The first file contains diversity values for the mitochondrial DNA dataset; the second contains those for the Y chromosome.
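This check can also be scripted. The following sketch assumes the two file names are exactly the root name followed by Mt and Y, as stated above; check_div_outputs is a hypothetical helper name.

```shell
# Verify that the Mt and Y diversity files exist and are not empty.
# Argument: the root name passed to -o (e.g. qstart_simu).
check_div_outputs() {
    root=$1
    rc=0
    for suffix in Mt Y; do
        if [ -s "$root$suffix" ]; then
            echo "found $root$suffix"
        else
            echo "missing or empty: $root$suffix" >&2
            rc=1
        fi
    done
    return $rc
}
```

For the example above, check_div_outputs qstart_simu reports each file and returns a non-zero status if either one is missing or empty.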
To run the results of simulations through other software (such as Compute from libSequence), output the DNA sequences of the population at the end of the run in FASTA format with the -fasta flag. The option -o qstart_simu provides the root of the file name. A suffix Z_X.fasta is added to this root, where X is the simulation number and Z is the type of DNA. This time, four files are created per simulation, one per type of DNA. Keep in mind that this will create a large number of files.
smartpop -p 1000 -nsimu 100 -sizeMt 320 -sizeY 320 -nbLociX 1 -sizeX 320 -nbLociA 2 -sizeA 320 -t 200 -fasta -o qstart_simu -v
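To get a feel for how many files to expect and what they contain, the sketch below multiplies the number of simulations by the number of DNA types, and counts the sequences in one FASTA file by counting its ">" header lines. Both function names are hypothetical, and the default of four DNA types simply reflects the command above.

```shell
# Number of FASTA files to expect: one per DNA type per simulation.
expected_fasta_files() {
    nsimu=$1
    dna_types=${2:-4}   # four DNA types are simulated in the command above
    echo $((nsimu * dna_types))
}

# Count the sequences in a FASTA file: every record starts with a ">" line.
count_fasta_records() {
    grep -c '^>' "$1"
}

expected_fasta_files 100   # prints 400
```

count_fasta_records takes the path of any one of the generated .fasta files and prints the number of sequences it holds.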
When using Compute or other software to measure population genetic estimators, the diversity files from SMARTPOP may become unnecessary. The flag -nodiv will prevent their creation, which makes the program run faster.
smartpop -p 1000 -nsimu 100 -sizeMt 320 -sizeY 320 -nbLociX 1 -sizeX 320 -nbLociA 2 -sizeA 320 -t 200 -fasta -o qstart_simu -nodiv -v
Estimators can also be calculated on a sample instead of the whole population, which better represents real data. Choose the option -sample X with X being the sample size to produce all the chosen outputs (fasta, Arlequin or SMARTPOP diversity files) for only a random sample of the population.
smartpop -p 1000 -nsimu 100 -sizeMt 320 -sizeY 320 -nbLociX 1 -sizeX 320 -nbLociA 2 -sizeA 320 -t 200 -fasta -o qstart_simu_sampled -sample 10 -v
For more information about how to run SMARTPOP, see the examples, the full manual and the original paper.
For any use of SMARTPOP or re-use of its source code, please cite: Guillot and Cox 2014. SMARTPOP: inferring the impact of social dynamics on genetic diversity through high speed simulations. BMC Bioinformatics 15:175.