AdamCicherski/AlfaPang

AlfaPang

AlfaPang constructs variation graphs using an alignment-free, reference-free approach based solely on intrinsic sequence properties. This design allows its runtime and memory usage to scale linearly with the size of the input sequences, enabling it to handle significantly larger genome sets than other methods.

Installation

To install AlfaPang, download the repository, navigate to the project root, and run the following commands:

    git submodule update --init --recursive  
    mkdir build  
    cd build  
    cmake ..  
    cmake --build .

Usage

To run the program, use the following command format:

    ./AlfaPang <input_fasta> <output_gfa> <k>

Note: The value of k must be an odd integer.
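Because an even k is easy to pass by mistake, it can help to check the value before starting a long run. The following is a minimal sketch, not part of AlfaPang: `is_valid_k` and `run_alfapang` are hypothetical helper names, and the wrapper assumes the `AlfaPang` binary is in the current directory, as in the command above.

```shell
# Hypothetical helper (not shipped with AlfaPang):
# succeeds only if $1 is a non-negative odd integer.
is_valid_k() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit
        *[13579])    return 0 ;;  # last decimal digit is odd
        *)           return 1 ;;  # even
    esac
}

# Hypothetical wrapper: validate k, then call AlfaPang with the documented arguments.
run_alfapang() {
    if ! is_valid_k "$3"; then
        echo "error: k must be an odd integer, got '$3'" >&2
        return 2
    fi
    ./AlfaPang "$1" "$2" "$3"
}
```

For example, `run_alfapang ecoli50.fa ecoli50_ap.gfa 48` would stop with an error message instead of invoking the program.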

Example

Download the example data with the following commands:

    wget https://zenodo.org/records/7937947/files/ecoli50.fa.gz
    gzip -d ecoli50.fa.gz

Then run:

    ./AlfaPang ecoli50.fa ecoli50_ap.gfa 47

Refining AlfaPang graphs is highly recommended. We suggest using smoothxg and gfaffix. Both tools and their dependencies can be installed by following the instructions in the pggb repository. Basic commands for graph refinement:

    smoothxg -g ecoli50_ap.gfa -r 50 -V -o ecoli50_ap_smooth.gfa
    gfaffix ecoli50_ap_smooth.gfa -o ecoli50_final.gfa
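The construction and refinement steps can be chained so that each intermediate file name is written only once. This is a sketch under stated assumptions: `build_and_refine` and `run_step` are hypothetical names, the tools are expected to be reachable exactly as in the commands above, and only the flags shown in this README are used. The value for `-r` is left as a parameter; the example above uses 50 with the ecoli50 data, so consult the smoothxg documentation before choosing it for other inputs.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run_step() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper (not part of AlfaPang).
# Arguments: <input_fasta> <output_prefix> <k> <value for smoothxg -r>
build_and_refine() {
    fasta=$1; prefix=$2; k=$3; r=$4
    # Each step runs only if the previous one succeeded.
    run_step ./AlfaPang "$fasta" "${prefix}_ap.gfa" "$k" &&
    run_step smoothxg -g "${prefix}_ap.gfa" -r "$r" -V -o "${prefix}_ap_smooth.gfa" &&
    run_step gfaffix "${prefix}_ap_smooth.gfa" -o "${prefix}_final.gfa"
}
```

Setting `DRY_RUN=1` before calling `build_and_refine ecoli50.fa ecoli50 47 50` prints the three commands without executing them, which is a cheap way to confirm the file names line up.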

Parameter k choice

We suggest choosing the parameter k based on the fraction of rare k-mers (those occurring only once in the k-mer spectrum). In our tests, values of k yielding around 5% rare k-mers result in a reasonable graph structure.

For this purpose, we provide the script kmer_fractions.sh, which uses the disk-based k-mer counter KMC. The script produces a .tsv file with the fraction of rare k-mers calculated for each k in a given range.

    ./AlfaPang/scripts/kmer_fractions.sh -i <input_fasta_file> -o <output_dir_name> -k <min_k_value> -K <max_k_value> -s <step>

Note: If the path to the KMC executable on your system differs from ${HOME}/kmc/bin/kmc, modify the script variable KMC_PATH.
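Once the table exists, picking the k closest to the 5% target can be automated. The sketch below is not part of the repository and rests on an assumption about the file layout: column 1 holds k and column 2 holds the rare k-mer fraction as a decimal between 0 and 1, separated by tabs. Inspect the actual .tsv first and adjust the column numbers (or the target, if the file stores percentages). `pick_k` is a hypothetical name; even values of k are skipped because AlfaPang requires an odd k.

```shell
# Hypothetical helper: print the odd k whose rare k-mer fraction is closest to a target.
# Usage: pick_k <tsv_file> [target_fraction]   (target defaults to 0.05)
# ASSUMPTION: column 1 = k, column 2 = fraction in [0, 1]; rows that do not
# look numeric (e.g. a header line) are ignored.
pick_k() {
    awk -F'\t' -v target="${2:-0.05}" '
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9.eE+-]+$/ && $1 % 2 == 1 {
            d = $2 - target
            if (d < 0) d = -d                      # absolute distance to the target
            if (best == "" || d < bestd) { best = $1; bestd = d }
        }
        END {
            if (best == "") exit 1                 # no usable row found
            print best
        }
    ' "$1"
}
```

The printed value can then be passed as the third argument to AlfaPang. On ties, the first matching row in the file wins.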
