

The PISE programme is a robust, long-established environment that allows integration of internal scripts and tools that are part of the target execution environment, as well as external tools that a user runs.
The availability of software environments for pipelining and workflow management helps the user to create custom analysis pipelines.

Although more than 30 SNP genotyping platforms are currently available, they are expensive and demand considerable expertise. One cost-effective way to validate identified SNPs is to develop CAPS (cleaved amplified polymorphic sequence) markers by predicting a restriction enzyme whose recognition site contains the identified SNP.

There are several available software solutions for sequence clustering, and a few popular ones for assembly. These include the popular Clustal group of programmes, d2-cluster for EST clustering, and cap3, PCAP, the TIGR assembler or Phrap for sequence assembly. Similarly, there are many freely available software programmes for the identification of SNPs, and DnaSP reports on nucleotide polymorphism features from aligned sequence data. None of them, however, automates group-wise identification and reporting of polymorphism statistics or, more importantly, considers the presence of heterozygous loci in the sequence data. Many available programs read heterozygous SNPs as missing or bad-quality sequence data and thus exclude them from analysis. As a result, estimates of features such as sequence diversity and the polymorphism information content (PIC) of SNPs and haplotypes are affected. The need for a module that could report SNP features for any number of user-defined groups, and that could calculate statistics while taking heterozygous loci into account, led to the development of the SNP DIVersity ESTimator module (divest.pm).

Sequence analysis involves pipelining data from one software package to another and often includes branched flows, for example when sequences must also be annotated with putative function. Format conversion scripts that convert the output of one program into the input of another are needed when the user wants to pipeline several tools and modules. Together with output parsing scripts, they allow some degree of automation in data analysis tasks.
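
The idea of counting heterozygous loci rather than discarding them, and of reporting statistics per user-defined group, can be illustrated with a minimal Python sketch. It is not the divest.pm implementation: the function names, the diploid assumption (two alleles per sequence), the reading of two-base IUPAC ambiguity codes as heterozygous calls, and the use of the Botstein form of PIC are all assumptions made for illustration.

```python
from collections import Counter

# IUPAC two-base ambiguity codes, read here as heterozygous genotypes
# (an assumption of this sketch, not a statement about divest.pm).
IUPAC_HET = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}


def allele_counts(column):
    """Count alleles in one alignment column, two alleles per sequence.

    A homozygous base contributes two copies of itself; a two-base IUPAC
    code contributes one copy of each base. Gaps, N and any other symbol
    are skipped as missing data.
    """
    counts = Counter()
    for base in column.upper():
        if base in "ACGT":
            counts[base] += 2
        elif base in IUPAC_HET:
            counts.update(IUPAC_HET[base])
    return counts


def pic(counts):
    """Polymorphism information content from allele counts.

    Uses PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    freqs = [n / total for n in counts.values()]
    homozygosity = sum(p * p for p in freqs)
    correction = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - correction


def groupwise_pic(sequences, groups, position):
    """Return {group: PIC} at one alignment position.

    sequences maps a sequence name to its aligned sequence;
    groups maps a group label to a list of sequence names.
    """
    result = {}
    for group, names in groups.items():
        column = "".join(sequences[name][position] for name in names)
        result[group] = pic(allele_counts(column))
    return result
```

With this counting scheme a sequence carrying `R` at a SNP position adds one `A` and one `G` allele to its group instead of being dropped, so the group's allele frequencies and PIC reflect the heterozygous call.
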
Sequence assembly is carried out to generate consensus sequences (contigs) and singlets. The user then processes this output to determine the presence of microsatellites or SNPs. Along with SNP identification, it is also desirable to obtain other information from the alignment, such as SNP and indel (insertion-deletion) frequency, the type of variant, haplotypes, PIC values for SNPs and haplotypes, and nucleotide diversity (π). Validation of predicted SNPs through wet-lab experiments is the next step in converting an identified SNP into a genetic marker.
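
As a rough illustration of the alignment-derived features mentioned above, the following Python sketch lists the variable columns of an alignment with their variant type and computes nucleotide diversity (π) as the average proportion of pairwise differences. It assumes an alignment given as a list of equal-length, upper-case strings; the function names are hypothetical, and indels and ambiguity codes are deliberately left out of the SNP scan.

```python
from itertools import combinations

# Purine-purine and pyrimidine-pyrimidine changes are transitions.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def snp_columns(alignment):
    """Yield (position, alleles, kind) for each variable column.

    Only columns in which every sequence has A, C, G or T are examined,
    so gapped columns (indels) and ambiguity codes are ignored here.
    """
    for pos, column in enumerate(zip(*alignment)):
        alleles = set(column)
        if not alleles <= set("ACGT") or len(alleles) < 2:
            continue
        if len(alleles) == 2:
            if frozenset(alleles) in TRANSITIONS:
                kind = "transition"
            else:
                kind = "transversion"
        else:
            kind = "multiallelic"
        yield pos, sorted(alleles), kind


def nucleotide_diversity(alignment):
    """Mean proportion of differing sites over all sequence pairs (pi).

    Sites where either sequence of a pair has a gap are skipped
    for that pair.
    """
    total = 0.0
    pairs = 0
    for a, b in combinations(alignment, 2):
        compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
        if not compared:
            continue
        total += sum(x != y for x, y in compared) / len(compared)
        pairs += 1
    return total / pairs if pairs else 0.0
```

For the three sequences `ACGT`, `ACAT` and `TCGT`, `snp_columns` reports an A/T transversion at position 0 and an A/G transition at position 2 (0-based), and `nucleotide_diversity` averages the pairwise proportions 1/4, 1/4 and 2/4.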


Single Nucleotide Polymorphisms (SNPs) are commonly found throughout the genome and provide dense maps over small chromosomal regions. Recent advances in sequencing and genotyping have made large-scale SNP diversity analysis possible in several crop species. This helps assess genome variation that can then be harnessed for crop improvement. Sequence diversity information may be desirable across defined groups of sequences, such as candidate gene transcripts from different genotypes, or assembled transcripts for a particular marker from more than one genotype. The grouping could be based on the objective of the study: across race, location, genes or regions within genes. Sequence data analysis usually involves steps such as clustering of sequence data to determine redundancy levels.
