Software

The programs on this page may be freely used, modified, and shared under the GNU General Public License version 3.0 (GPL-3.0).




YBYRÁ facilitates comparison of large phylogenetic trees

The YBYRÁ project integrates software solutions for data analysis in phylogenetics. It comprises tools for (1) topological distance calculation based on the number of shared splits or clades, (2) sensitivity analysis and automatic generation of sensitivity plots (“Navajo rugs”) and (3) clade diagnoses based on different categories of synapomorphies (using TNT). YBYRÁ also provides (4) a framework to facilitate the search for potential rogue taxa based on how much they affect average matching split distances (using MSdist).
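To illustrate what counting shared clades means in item (1), here is a minimal sketch. It is not YBYRÁ's code: it assumes rooted trees written as nested Python tuples of taxon names, and the function names `collect_clades`, `clade_set` and `compare` are hypothetical.

```python
def collect_clades(tree, found):
    """Return the leaf set under `tree`; record each internal node's leaf set in `found`."""
    if isinstance(tree, str):  # a leaf is just a taxon name
        return frozenset([tree])
    leaves = frozenset().union(*(collect_clades(child, found) for child in tree))
    found.add(leaves)
    return leaves


def clade_set(tree):
    """Return the set of clades (as frozensets of taxon names) in a rooted tree."""
    found = set()
    collect_clades(tree, found)
    return found


def compare(tree_a, tree_b):
    """Return (number of shared clades, number of clades found in only one tree)."""
    a, b = clade_set(tree_a), clade_set(tree_b)
    return len(a & b), len(a ^ b)


# Two made-up five-taxon trees.
tree_1 = (("A", "B"), ("C", ("D", "E")))
tree_2 = (("A", "C"), ("B", ("D", "E")))
shared, distance = compare(tree_1, tree_2)
```

Here the two trees share the clade (D, E) and the root clade, so `shared` is 2 and `distance` is 4. The measures YBYRÁ actually reports are described in the manual and the cited paper.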

Citation: Machado D.J. 2015. YBYRÁ facilitates comparison of large phylogenetic trees. BMC Bioinformatics 16:204. doi:10.1186/s12859-015-0642-9

Financial support: FAPESP Proc. No. 2009/13561-5, 2012/10000-5 and 2013/05958-8.

Download Manual Gitlab

Image for VirtualBox: READ-ME Download


AWA: test completeness of assemblies of circular DNA based on short sequence reads

The AWA project integrates software solutions for testing the completeness of assemblies of circular DNA based on short sequence reads. A new version of AWA is currently being prepared and will accompany a discussion of its applications for all types of circular DNA. The current version should be considered a draft and has only been tested for mitogenome assemblies of frog DNA.

Financial support: FAPESP Proc. No. 2013/05958-8 and 2015/18654-2.

Download Gitlab Manual


sudoParallelGarli.py

sudoParallelGarli is a collection of Python programs that can be used to run multiple Garli processes at the same time. The original program, sudoParallelGarli4pbs.py, was designed for our own cluster, ACE, and relies on the PBS job scheduler. The newest program, sudoParallelGarli4bash.py, runs without a job scheduler.

Download READ-ME Gitlab


selectTiles.py

As stated by Yang et al. (2013: 14), “to get reliable result in downstream analysis, it is necessary to remove low quality reads avoiding mismatches in read mapping, and false paths during genome assembly”. Because of its functional versatility and run-time efficiency, we selected the HTQC toolkit (Yang et al. 2013) for read quality assessment and filtering. The complete quality control protocol is described below.

The program “selectTiles.py” automates the selection of tiles to be removed after running "ht-stat", following criteria based on the HTQC guidelines:

(1) more than 50% of the reads have a quality score below 10;

(2) less than 10% of the reads have quality greater than 30;

(3) most reads have quality below 20.

Users can change these criteria at will or add further conditions, as explained at the beginning of the program. Whenever necessary, the selected tiles can then be removed with “ht-filter”.
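The three criteria can be sketched as follows. This is an illustration only, not the code of selectTiles.py: it assumes a tile is flagged when any one criterion is met, reads "most reads" as more than 50%, and takes a plain list of per-read quality scores per tile instead of parsing "ht-stat" output. The names `flag_tile` and `select_tiles`, the tile numbers and the scores are all hypothetical.

```python
def flag_tile(qualities):
    """Return True if a tile meets any of the three removal criteria."""
    total = len(qualities)
    if total == 0:
        return False
    below_10 = sum(q < 10 for q in qualities) / total
    above_30 = sum(q > 30 for q in qualities) / total
    below_20 = sum(q < 20 for q in qualities) / total
    return (
        below_10 > 0.5      # (1) more than 50% of reads below 10
        or above_30 < 0.1   # (2) less than 10% of reads above 30
        or below_20 > 0.5   # (3) most reads below 20 (assumed: more than 50%)
    )


def select_tiles(tiles):
    """Return the sorted identifiers of the tiles that should be removed."""
    return sorted(tile for tile, quals in tiles.items() if flag_tile(quals))


# Made-up quality scores for four tiles.
tiles = {
    1101: [35, 36, 34, 38, 33, 37, 32, 36, 35, 34],  # good tile
    1102: [5, 8, 7, 9, 6, 4, 35, 36, 34, 33],        # fails (1)
    1103: [25, 26, 27, 28, 29, 24, 23, 22, 21, 25],  # fails (2)
    1104: [15, 16, 17, 18, 19, 12, 35, 36, 37, 38],  # fails (3)
}
to_remove = select_tiles(tiles)
```

With these scores, `to_remove` is `[1102, 1103, 1104]`. Changing a threshold or adding a condition only requires editing the expression returned by `flag_tile`.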

Download Gitlab


parseCaf.py

CAF is a text format for describing sequence assemblies. It is acedb-compliant and is an extension of the ace-file format used earlier, but with support for base quality measures and a more extensive description of the Sequence data. The program parseCaf.py parses padded CAF sequence files for DNA data. CAF files must have only one contig per file. Execute "python parseCaf.py --help" to see all arguments available.

Download Gitlab


PAckage of TOols with Fast(A/Q) Utilities (PATO-FU)

The PAckage of TOols with Fast(A/Q) Utilities (PATO-FU) is a collection of homemade Python scripts that can be useful for dealing with raw sequencing data in FASTA or FASTQ format. The programs can also accept gzip-compressed files as input.
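As a rough idea of how a script can accept both plain and compressed input, the sketch below reads FASTQ records from either kind of file. It is not taken from PATO-FU: it assumes compressed input means gzip, that each record spans exactly four lines, and the function names `open_maybe_gzip` and `read_fastq` are hypothetical.

```python
import gzip


def open_maybe_gzip(path):
    """Open a text file, handling gzip compression transparently."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":  # gzip files start with these two bytes
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_fastq(path):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ file."""
    with open_maybe_gzip(path) as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:  # end of file
                break
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # skip the '+' separator line
            quality = handle.readline().rstrip("\n")
            yield header[1:], sequence, quality
```

Checking the first two bytes instead of the file extension means a compressed file is recognized even when it is not named `.gz`.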

Financial support: FAPESP Proc. No. 2013/05958-8.

Download Quick reference Gitlab Manual


A parallel application tool originally developed to extract reads that align against a local database

The programs in this toolkit allow the user to search a large sequence file (FASTA or FASTQ) for reads that align against a particular local database. Alignments are parallelized with Python's multiprocessing module.
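The general pattern of spreading per-read work over several processes with `multiprocessing.Pool` can be sketched as below. This is not the tool's code: the alignment step is replaced by a stand-in (an exact substring match), and `DATABASE`, `read_hits_database` and `find_matching_reads` are hypothetical names.

```python
from multiprocessing import Pool

# Hypothetical reference sequences standing in for the local database.
DATABASE = ("ACGTACGT", "TTGACCA")


def read_hits_database(read):
    """Stand-in for a real alignment: exact substring match against DATABASE."""
    name, sequence = read
    return name if any(sequence in ref for ref in DATABASE) else None


def find_matching_reads(reads, processes=2):
    """Check reads in parallel and return the names of those that match."""
    with Pool(processes=processes) as pool:
        # chunksize groups reads so that each task carries a useful amount of work
        results = pool.map(read_hits_database, reads, chunksize=1000)
    return [name for name in results if name is not None]
```

When this is run as a script, the call to `find_matching_reads` should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. A real pipeline would replace `read_hits_database` with a call to an aligner.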

Financial support: FAPESP Proc. No. 2012/10000-5 and 2013/05958-8.

Download Manual Gitlab


redux: a parallel application tool for duplicate removal

The redux program can parallelize the search for duplicate sequences in FASTA- or FASTQ-formatted files.
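The core idea of duplicate removal can be sketched serially as below. This is an assumption-laden illustration, not redux itself: it treats two records as duplicates only when their sequences are identical (ignoring case), it leaves out the parallelization, and the function name `remove_duplicates` is hypothetical.

```python
import hashlib


def remove_duplicates(records):
    """Keep the first (name, sequence) record for each distinct sequence."""
    seen = set()
    unique = []
    for name, sequence in records:
        # Store a fixed-size digest instead of the full sequence to save memory.
        digest = hashlib.sha256(sequence.upper().encode("ascii")).digest()
        if digest not in seen:
            seen.add(digest)
            unique.append((name, sequence))
    return unique


# Made-up records: "b" and "d" repeat the sequence of "a".
records = [("a", "ACGT"), ("b", "acgt"), ("c", "TTTT"), ("d", "ACGT")]
unique_records = remove_duplicates(records)
```

Here `unique_records` keeps only records "a" and "c". One way to parallelize such a search is to partition records by digest so that each partition can be processed independently; whether redux works this way is described in its manual.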

Financial support: FAPESP Proc. No. 2012/10000-5 and 2013/05958-8.

Download Manual Gitlab


Other Python scripts

This zip file contains a variety of Python scripts for multiple purposes. Usage information is provided as comments at the beginning of each file.

Download List of files