About CDSParser

File Types

*.cgi
*.cgi is the default extension GenBank uses when GBSeq XML files are downloaded. All GenBank files must be in GBSeq XML format to work with CDSParser.

*.fasta
FASTA files are a type of sequence file. CDSParser can output these files for use in ClustalW and other programs.

key (*.txt)
Key files are output alongside the FASTA files from CDSParser. They match the sequence labels in the FASTA files to the name of the sequence in a particular GenBank record.

*.grp
*.grp is the extension given to group files output by CDSParser. They store the groups and the delete set as specified by the user. The first line is the list of gene names not to output. Each subsequent line is a group of names. The names are delimited by a tilde (~) because I have not been able to find any gene names on GenBank that contain a tilde. Group sets are useful when there are various names for the same gene, such as "cyt b", "cyt-b", and "cytochrome b", because CDSParser will treat them as the same name and output them in the same column (in tab-delimited files). See hantavirus.grp for an example.

*.def
clustal.def and defaults.def are CDSParser files used to locate ClustalW and to specify which columns should be output in the tab-delimited files. defaults.def is provided to give the user more control over the output of CDSParser.

*.aln
This file type is an alignment file, and is one of the file types that ClustalW outputs when it calculates a neighbor-joining tree.

*.tre
ClustalW can output neighbor-joining trees as Nexus trees.

*.nj
This is the format that Clustal uses for neighbor-joining trees.

*.ph
ClustalW can output neighbor-joining trees as Phylip trees.

tab-delimited
Tab-delimited files are output for use in spreadsheet programs. CDSParser outputs two types of tab-delimited files.
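
The *.grp layout described above (a first line holding the delete set, then one tilde-delimited group of names per line) can be illustrated with a short Python sketch. This is not CDSParser's own code. The function names `parse_grp` and `build_alias_map` are hypothetical, and so are the choices to trim surrounding whitespace, skip blank lines, and use the first name in each group as its label. The names on the first line of the sample are invented for illustration.

```python
def parse_grp(text):
    """Parse the contents of a .grp file into (delete_set, groups).

    Assumed layout, taken from the description above:
      line 1  -> tilde-delimited gene names that should not be output
      line 2+ -> one tilde-delimited group of equivalent gene names per line
    """
    def split_names(line):
        # Assumption: whitespace around a name is not significant.
        return [name.strip() for name in line.split("~") if name.strip()]

    lines = text.splitlines()
    if not lines:
        return set(), []

    delete_set = set(split_names(lines[0]))
    # Assumption: blank lines carry no group and are skipped.
    groups = [g for g in (split_names(line) for line in lines[1:]) if g]
    return delete_set, groups


def build_alias_map(groups):
    """Map every name to the first name of its group (a hypothetical choice
    of label; the label CDSParser actually uses is not stated)."""
    alias = {}
    for group in groups:
        for name in group:
            alias[name] = group[0]
    return alias


# Illustrative content only; the first-line names are made up.
sample = "tRNA-Leu~ND1\ncyt b~cyt-b~cytochrome b\nS~S segment\n"
delete_set, groups = parse_grp(sample)
alias = build_alias_map(groups)
```

With a map like `alias`, the different spellings of a gene name resolve to a single label, which is the effect the group sets have on the columns of the tab-delimited output.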