Quick Start
- Specify the paths of the references (.fasta) and samples (.fastq) in a configuration file (.yaml).
  For example, save the following block to a text file named `data.yaml`:

      reference:
        contamination:
          fa: ./ref/contamination.fa
        genes:
          fa: ./ref/genes.fa
        genome:
          fa: /data/reference/genome/Mus_musculus/GRCm39.fa
          star: /data/reference/genome/Mus_musculus/star/GRCm39.release108
      samples:
        mESCWT-rep1-input:
          data:
            - R1: ./test/IP16.fastq.gz
          group: mESCWT
          treated: false
        mESCWT-rep1-treated:
          data:
            - R1: ./test/IP4.fastq.gz
          group: mESCWT
          treated: true
        mESCWT-rep2-treated:
          data:
            - R1: ./test/IP5.fastq.gz
          group: mESCWT
          treated: true

  You can also copy and edit this template.
  Read more details on how to customize it.
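
Before launching the pipeline, it can help to confirm that the files named in the configuration actually exist. The following is a minimal sketch, not part of the pipeline itself: `list_missing_paths` is a hypothetical helper, and it assumes each `fa:` and `R1:` entry sits on its own line as in the example above. It uses `sed` rather than a real YAML parser, and it skips the `star:` entry because this README does not say whether the index must exist beforehand. Relative paths are resolved against the current directory, so run it from where you plan to start the pipeline.

```shell
# Hypothetical helper: print every fa:/R1: path in the config that does not exist.
# Assumption: one "key: path" pair per line, as in the example data.yaml.
list_missing_paths() {
  # Strip leading indentation, an optional list dash, and the key itself,
  # leaving only the path; lines without fa:/R1: are not printed by sed.
  sed -n -E 's/^[[:space:]-]*(fa|R1):[[:space:]]*//p' "$1" |
    while IFS= read -r path; do
      [ -e "$path" ] || printf 'missing: %s\n' "$path"
    done
}
```

Calling `list_missing_paths data.yaml` prints one `missing: ...` line per absent file and prints nothing when every listed file is found.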
- Run the whole analysis with one command:

      apptainer run docker://y9ch/bidseq

  By default, the pipeline loads the configuration file named `data.yaml` in the current directory.

  How to run apptainer on computation nodes without internet access?

  - (On the login node with internet connection) Run `module load apptainer` to load the apptainer utilities if apptainer is not available by default.
  - (On the login node with internet connection) Build the `bidseq_latest.sif` file with the command `apptainer pull docker://y9ch/bidseq`.
  - (On the computation node) Run `apptainer run bidseq_latest.sif -c data.yaml` to start the pipeline. Note that most HPC systems mount directories in a complex manner. Therefore, you need to find the actual path by executing `realpath ./` and pass this output to apptainer using `apptainer run -B /the/real/path ...`.

  If your configuration file is not named `data.yaml`, add the `-c your_file_name.yaml` argument after the command.

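
The offline steps can be combined into one command line for a job script. The sketch below is only an illustration: `bidseq_offline_cmd` is a hypothetical helper that prints the command instead of running it, so you can inspect it first. It assumes the `.sif` file was already pulled on the login node and that binding the resolved current directory is sufficient on your cluster, which may not hold if your references live elsewhere.

```shell
# Login node (internet access), as described in the steps above:
#   module load apptainer                 # only if apptainer is not available by default
#   apptainer pull docker://y9ch/bidseq   # writes bidseq_latest.sif

# Hypothetical helper: print the run command for the computation node.
# $1 = path to the .sif image, $2 = config file (defaults to data.yaml).
bidseq_offline_cmd() {
  sif="$1"
  config="${2:-data.yaml}"
  # Resolve the real location of the current directory for the -B bind option.
  workdir=$(realpath ./)
  printf 'apptainer run -B %s %s -c %s\n' "$workdir" "$sif" "$config"
}
```

Running `bidseq_offline_cmd bidseq_latest.sif` in the project directory prints a line you can paste into a job script. Additional `-B` options would be needed for reference directories outside the current one.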
- View the analysis reports and filtered sites.

  Three folders will be created in the working directory (default: `workspace`):

  - Trimming, mapping, and deduplication reports are in the `report_reads` folder, with the key numbers from all steps summarized on one webpage (example).
  - Filtered sites for Ψ site detection are in the `filter_sites` folder. These sites have passed only the simplest filtering; you can apply customized thresholds to them based on your data type and quality.
  - Processed mapping results (.bam) are in the `align_bam` folder. You can zoom in on the locations you are interested in using IGV.
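
After a run finishes, a quick check that the three output folders exist can catch a failed or partial run early. This is a rough sketch under the assumption that the folders sit directly inside the working directory (`workspace` by default); `list_bidseq_outputs` is a hypothetical helper, and it only counts entries without inspecting file contents.

```shell
# Hypothetical helper: report whether each expected output folder exists
# and how many entries it holds. $1 = working directory (defaults to workspace).
list_bidseq_outputs() {
  workdir="${1:-workspace}"
  for sub in report_reads filter_sites align_bam; do
    if [ -d "$workdir/$sub" ]; then
      # ls -A includes hidden files; tr strips padding that some wc versions add.
      count=$(ls -A "$workdir/$sub" | wc -l | tr -d ' ')
      printf '%s: %s entries\n' "$sub" "$count"
    else
      printf '%s: not found\n' "$sub"
    fi
  done
}
```

Calling `list_bidseq_outputs` with no argument checks the default `workspace` directory.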