SID-Chart

SID-Chart is an automated pipeline for characterizing the biosynthetic gene clusters (BGCs) of genomes. It is also designed to integrate further data layers, such as a phylogenetic tree and the similarity to specific proteins.

Prerequisites

Nextflow installed
Docker installed
Docker images for:
- antiSMASH
- MMseqs2
- BiG-SCAPE
- chewBBACA
- MEGA-CC
Streamlit installed
pandas installed
seaborn installed
matplotlib installed
Java 17 or later
Python 3.0

Input

SID-Chart expects input data in the following format.

└── Data
    ├── dataset_staphy
        ├── ncbi_dataset
            ├── GCA_000433035.1
                ├── GCA_000433035.1_MGS324_genomic.fna
                ├── genomic.gbff
                └── protein.faa
            ├── ...
            └── ...
    ├── reference_BGCs
        ├── BGC0000943.gbk
        ├── ...
        └── ...
    ├── reference_genome
        └── GCF_001027105.1_ASM102710v1_genomic.fna
    ├── metadata.tsv
    ├── overviewBGCs.csv
    ├── proteins_uptake.fa
    └── Staphylococcus_aureus.trn
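
Before launching a run, it can help to confirm that the directory matches the layout above. The following is a minimal sketch and not part of SID-Chart: the function name `check_input_layout` is hypothetical, the file and folder names are the defaults from the example tree, and the `*_genomic.fna` pattern is an assumption based on the single example file shown.

```python
from pathlib import Path

# Default names copied from the example tree; adjust them if your
# configuration uses different names.
EXPECTED_DIRS = ["reference_BGCs", "reference_genome"]
EXPECTED_FILES = [
    "metadata.tsv",
    "overviewBGCs.csv",
    "proteins_uptake.fa",
    "Staphylococcus_aureus.trn",
]


def check_input_layout(data_dir, dataset="dataset_staphy"):
    """Return a list of problems found; an empty list means nothing is missing."""
    root = Path(data_dir)
    problems = []

    for name in EXPECTED_DIRS:
        if not (root / name).is_dir():
            problems.append(f"missing directory: {name}")
    for name in EXPECTED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")

    ncbi = root / dataset / "ncbi_dataset"
    if not ncbi.is_dir():
        problems.append(f"missing directory: {dataset}/ncbi_dataset")
        return problems

    assemblies = sorted(p for p in ncbi.iterdir() if p.is_dir())
    if not assemblies:
        problems.append("ncbi_dataset contains no assembly directories")

    for assembly in assemblies:
        # Assumption: the genomic FASTA always ends in "_genomic.fna".
        if not any(assembly.glob("*_genomic.fna")):
            problems.append(f"{assembly.name}: missing *_genomic.fna")
        for fixed_name in ("genomic.gbff", "protein.faa"):
            if not (assembly / fixed_name).is_file():
                problems.append(f"{assembly.name}: missing {fixed_name}")

    return problems
```

Calling `check_input_layout("Data")` returns one message per missing item, so an empty list suggests the directory is complete with respect to the default names.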

ncbi_dataset/: Contains one subdirectory per genome assembly to be analyzed, each holding the genomic FASTA (.fna), GenBank (genomic.gbff), and protein FASTA (protein.faa) files.
reference_BGCs/: Contains all biosynthetic gene clusters (BGCs) to be included in the analysis.
reference_genome/: Contains the FASTA file of a reference genome required by chewBBACA.
metadata.tsv: Lists the accession numbers and corresponding NCBI organism names.
overviewBGCs.csv: Provides a mapping between BGCs and lipoproteins.
proteins_uptake.fa: Contains the lipoproteins to be analyzed.
Staphylococcus_aureus.trn: A Prodigal training file required by chewBBACA (can be created using Pyrodigal).
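
Since metadata.tsv is keyed by accession number, a quick cross-check against the assembly directories can catch genomes that lack a metadata entry. This is a rough sketch under stated assumptions: the helper name `assemblies_missing_from_metadata` is hypothetical, and it assumes that the accession is the first tab-separated column and that the assembly directories are named by accession, as in the example tree.

```python
import csv
from pathlib import Path


def assemblies_missing_from_metadata(ncbi_dataset_dir, metadata_tsv):
    """Return the assembly directory names that have no row in the metadata file."""
    # Assumption: the first tab-separated column holds the accession number.
    with open(metadata_tsv, newline="") as handle:
        listed = {
            row[0].strip()
            for row in csv.reader(handle, delimiter="\t")
            if row
        }
    on_disk = sorted(p.name for p in Path(ncbi_dataset_dir).iterdir() if p.is_dir())
    return [name for name in on_disk if name not in listed]
```

A header row, if present, does not affect the result, because only directory names absent from the first column are reported.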

File and folder naming expected by SID-Chart can be customized via nextflow.conf.

Run Pipeline

Check that the input file names correspond to the default parameters defined in nextflow.conf.
If they differ, either modify the values in nextflow.conf or provide the correct file names as arguments in run_pipeline.sh.
Set the input directory inside the run_pipeline.sh script by defining it as the --input parameter in the Nextflow command.
Inside the nf directory run:

./run_pipeline.sh [RUN_NAME]

Run Visualization

Inside the web directory run:

./run_visualization.sh [RUN_NAME]
