The bacterial database is over 200 GB, so you first need to split it into manageable chunks before BWA can use it. You can use FASTA Splitter (http://kirill-kryukov.com/study/tools/fasta-splitter/).
Each file needs to be under 3 GB, so split the database into as many chunks as that requires. I used 100.
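One way to choose the number of parts is to divide the database size by the 3 GB limit and round up, then add some headroom because the parts may not come out exactly equal. The following is a minimal sketch, not part of the original workflow: the helper name `parts_needed` is hypothetical, the file name is taken from the split script below, and `stat -c %s` assumes GNU coreutils.

```shell
#!/bin/bash
# Sketch: estimate the minimum number of parts so each chunk stays under a size limit.
# parts_needed TOTAL_BYTES LIMIT_BYTES  ->  prints ceil(TOTAL_BYTES / LIMIT_BYTES)
parts_needed() {
    local total=$1 limit=$2
    echo $(( (total + limit - 1) / limit ))
}

# Assumed input name (same as in the split script); adjust to your file.
fasta="Bacteria_FG_split_prinseq.fasta"
limit=$(( 3 * 1024 * 1024 * 1024 ))   # 3 GiB expressed in bytes

if [ -f "$fasta" ]; then
    size=$(stat -c %s "$fasta")       # GNU stat: file size in bytes
    echo "Minimum parts: $(parts_needed "$size" "$limit")"
fi
```

The result is a lower bound. Using a larger value for `--n-parts`, such as the 100 used here, keeps every chunk comfortably below the limit.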
#!/bin/bash
# ---------------------------------------------------------------------
#SBATCH --job-name=fasta_split_bac
#SBATCH --account=def-account
#SBATCH --cpus-per-task=16
# ---------------------------------------------------------------------
echo "Current working directory: `pwd`"
echo "Starting run at: `date`"
# ---------------------------------------------------------------------
perl /pathto/fasta-splitter.pl --n-parts 100 Bacteria_FG_split_prinseq.fasta
# ---------------------------------------------------------------------
echo "Job finished with exit code $? at: `date`"
# ---------------------------------------------------------------------
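After the split finishes, it may be worth confirming that every chunk really is under the limit before indexing. This is a small sketch under stated assumptions: the helper `oversized_chunks` is hypothetical, and the pattern `*.part-*.fasta` is an assumption about how the splitter names its output, so adjust it to match your files.

```shell
#!/bin/bash
# Sketch: list chunk files larger than a byte limit.
# oversized_chunks DIR LIMIT_BYTES [PATTERN]
oversized_chunks() {
    local dir=$1 limit_bytes=$2 pattern=${3:-'*.part-*.fasta'}   # pattern is an assumed naming scheme
    # -size +Nc matches files larger than N bytes
    find "$dir" -maxdepth 1 -type f -name "$pattern" -size +"${limit_bytes}"c
}

limit=$(( 3 * 1024 * 1024 * 1024 ))   # 3 GiB expressed in bytes
big=$(oversized_chunks . "$limit")
if [ -n "$big" ]; then
    echo "Chunks over the limit (split into more parts):"
    echo "$big"
else
    echo "No chunks over the limit were found."
fi
```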
The next step is to index the databases. You must use the BWA provided in the DeconSeq package!
The newer BWA reads the files incorrectly for this purpose and produces only 5 of the 8 output files that are needed. Cedar is a 64-bit Linux system, so use bwa64.
Modified from a script kindly provided by Dr. Stefan Dennenmoser.
#!/bin/bash
# ---------------------------------------------------------------------
#SBATCH --job-name=index_bac
#SBATCH --account=def-srogers
#SBATCH --mem-per-cpu=20G
# ---------------------------------------------------------------------
echo "Current working directory: `pwd`"
echo "Starting run at: `date`"
# ---------------------------------------------------------------------
cd /path/to/Bacterial_Database
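# Run as a job array (for example, sbatch --array=1-100 for 100 chunks) so that SLURM_ARRAY_TASK_ID selects one chunk per task
filename=$(ls -1 *.fasta* | tail -n +"${SLURM_ARRAY_TASK_ID}" | head -1)
filename2=${filename%.fasta} # the filename without the .fasta extension
/path/to/deconseq-standalone-0.4.3/bwa64 index -p "$filename2" -a bwtsw "$filename"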
# ---------------------------------------------------------------------
echo "Job finished with exit code $? at: `date`"
# ---------------------------------------------------------------------
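Once the indexing jobs finish, you can check that each chunk has all eight index files. This sketch assumes the eight files are the ones written by older BWA releases (.amb, .ann, .bwt, .pac, .rbwt, .rpac, .rsa, .sa). Verify that list against your own output, since it is not stated above. The helper `missing_index_files` is hypothetical.

```shell
#!/bin/bash
# Sketch: print any index file that is missing or empty for a given index prefix.
# The extension list is an assumption based on the older BWA index format.
missing_index_files() {
    local prefix=$1 ext
    for ext in amb ann bwt pac rbwt rpac rsa sa; do
        [ -s "${prefix}.${ext}" ] || echo "${prefix}.${ext}"
    done
}

# Check every chunk in the current directory; the prefix is the name without .fasta.
for fasta in *.fasta; do
    [ -e "$fasta" ] || continue        # skip if the glob matched nothing
    missing_index_files "${fasta%.fasta}"
done
```

Any path printed points to a chunk whose array task should be rerun. No output means every chunk has a complete set of index files.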