src/clustal/seq.h File Reference
#include "squid/squid.h"
#include "util.h"
Data Structures
structure for storing multiple sequences
Defines
Functions
Stripped down version of squid's alistat.
void
AddSeq (
mseq_t **prMSeqDest_p, char *pcSeqName, char *pcSeqRes)
Creates a new sequence entry and appends it to an existing mseq structure.
Swap two sequences in an mseq_t structure.
Dealigns all sequences in mseq structure, updates the sequence length info and sets aligned to FALSE.
convert int-encoded iSeqType to string
int
ReadSequences (
mseq_t *prMSeq_p, char *pcSeqFile, int iSeqType, int iSeqFmt, int iMaxNumSeq, int iMaxSeqLen)
reads sequences from file
Allocates and initialises a new mseq_t.
Frees an mseq_t and its members and zeros all members.
copies an mseq structure
debug output of sqinfo struct
Write alignment to file.
Removes all gap-characters from a sequence.
Shuffle mseq order.
Sort sequences by length.
Appends an mseq structure to an already existing one. filename will be left untouched.
Checks whether the sequences in the given mseq structure are aligned. By definition this is true only if all sequences have the same length and at least one gap was found.
Define Documentation
#define AMINOACID_ANY 'X'
#define NUCLEOTIDE_ANY 'N'
#define SEQTYPE_PROTEIN kAmino
#define SEQTYPE_UNKNOWN kOtherSeq
Int-encoded sequence types. These are kept in sync with squid's seqtypes and are used here only for convenience.
Function Documentation
void AddSeq
(
mseq_t **
prMSeqDest_p,
char *
pcSeqName,
char *
pcSeqRes
)
Creates a new sequence entry and appends it to an existing mseq structure.
- Parameters:
-
[out] prMSeqDest_p Already existing and initialised mseq structure
[in] pcSeqName sequence name of the sequence to add
[in] pcSeqRes the actual sequence (residues) to add
- Note:
- Don't forget to update the align and type flag if necessary!
FIXME allow adding of more features
void AliStat
(
mseq_t *
prMSeq,
bool
bSampling,
bool
bReportAll
)
Stripped down version of squid's alistat.
- Parameters:
-
[in] prMSeq The alignment to analyse
[in] bSampling For many sequences: samples from pool
[in] bReportAll Report identities for all sequence pairs
There is no need to worry about sequence case because our version of PairwiseIdentity is case-insensitive.
mseq to squid msa
FIXME code overlap with WriteAlignment. Make it a function and take code there (contains more comments) as template
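The case-insensitivity mentioned above can be illustrated with a minimal sketch. This is not the PairwiseIdentity used by the source: the function name, the gap characters ('-' and '.') and the normalisation by the shorter ungapped length are all assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: treating '-' and '.' as gaps is an assumption. */
static int identity_is_gap(char c)
{
    return c == '-' || c == '.';
}

/* Sketch of a case-insensitive pairwise identity for two aligned rows.
 * Counts columns where both rows hold the same residue (ignoring case)
 * and divides by the shorter ungapped length (an assumed normalisation).
 * Returns 0.0 if either row contains no residues. */
double pairwise_identity_sketch(const char *s1, const char *s2)
{
    int idents = 0;
    int len1 = 0;
    int len2 = 0;
    int shorter;
    size_t i;

    assert(s1 != NULL && s2 != NULL);
    for (i = 0; s1[i] != '\0' && s2[i] != '\0'; i++) {
        int gap1 = identity_is_gap(s1[i]);
        int gap2 = identity_is_gap(s2[i]);

        if (!gap1) len1++;
        if (!gap2) len2++;
        if (!gap1 && !gap2 &&
            toupper((unsigned char)s1[i]) == toupper((unsigned char)s2[i])) {
            idents++;
        }
    }
    shorter = len1 < len2 ? len1 : len2;
    return shorter == 0 ? 0.0 : (double)idents / shorter;
}
```

Because both residues are upper-cased before comparison, "ACGT" and "acgt" count as fully identical, so callers do not need to normalise case first.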
void CopyMSeq
(
mseq_t **
prMSeqDest_p,
)
copies an mseq structure
- Parameters:
-
[out] prMSeqDest_p Copy of mseq structure
[in] prMSeqSrc Source mseq structure to copy
- Note:
- caller has to free copy by calling FreeMSeq()
void DealignMSeq
(
mseq_t *
mseq
)
Dealigns all sequences in mseq structure, updates the sequence length info and sets aligned to FALSE.
- Parameters:
-
[out] mseq The mseq structure to dealign
void DealignSeq
(
char *
seq
)
Removes all gap-characters from a sequence.
- Parameters:
-
[out] seq Sequence to dealign
- Note:
- seq will not be reallocated
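A minimal sketch of in-place gap removal, consistent with the note that the buffer is not reallocated. The function name and the choice of gap characters ('-' and '.') are assumptions; the actual DealignSeq may recognise a different set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: which characters count as gaps is an assumption. */
static int dealign_is_gap(char c)
{
    return c == '-' || c == '.';
}

/* Sketch: compacts all non-gap characters to the front of the buffer
 * and re-terminates the string. The buffer keeps its original
 * allocation; only the string length shrinks. */
void dealign_seq_sketch(char *seq)
{
    size_t read_pos;
    size_t write_pos = 0;

    assert(seq != NULL);
    for (read_pos = 0; seq[read_pos] != '\0'; read_pos++) {
        if (!dealign_is_gap(seq[read_pos])) {
            seq[write_pos++] = seq[read_pos];
        }
    }
    seq[write_pos] = '\0';
}
```

Since the string only ever gets shorter, writing into the same buffer is safe, which is why no reallocation is needed.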
int FindSeqName
(
char *
seqname,
)
- Parameters:
-
[in] seqname The sequence name to search for
[in] mseq The multiple sequence structure to search in
- Returns:
- -1 on failure, sequence index of matching name otherwise
- Warning:
- If sequence name happens to be used twice, only the first one will be reported back
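The return convention and the first-match warning can be sketched with a linear scan. The function below works on a plain array of names rather than on mseq_t, whose layout is not shown here; the name, the signature and the use of exact string comparison are assumptions.

```c
#include <assert.h>
#include <string.h>

/* Sketch: returns the index of the first name equal to seqname,
 * or -1 if no name matches. Later duplicates are never reported. */
int find_seq_name_sketch(const char *seqname,
                         const char *const *names, int nseqs)
{
    int i;

    assert(seqname != NULL && names != NULL);
    for (i = 0; i < nseqs; i++) {
        if (strcmp(seqname, names[i]) == 0) {
            return i; /* stop at the first hit */
        }
    }
    return -1;
}
```

Callers that need to detect duplicate names would have to continue scanning past the returned index themselves.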
void FreeMSeq
(
mseq_t **
mseq
)
Frees an mseq_t and its members and zeros all members.
- Parameters:
-
[in] mseq mseq_t to free
- Note:
- use in conjunction with NewMSeq()
- See also:
- new_mseq
void JoinMSeqs
(
mseq_t **
prMSeqDest_p,
)
Appends an mseq structure to an already existing one. filename will be left untouched.
- Parameters:
-
[in] prMSeqDest_p MSeq structure to append to
[out] prMSeqToAdd MSeq structure to be appended
void LogSqInfo
(
SQINFO *
sqinfo
)
debug output of sqinfo struct
- Parameters:
-
[in] sqinfo Squid's SQINFO struct for a certain sequence
- Note:
- useful for debugging only
void NewMSeq
(
mseq_t **
prMSeq
)
allocate and initialise new mseq_t
- Parameters:
-
[out] prMSeq newly allocated and initialised mseq_t
- Note:
- caller has to free by calling FreeMSeq()
- See also:
- FreeMSeq
int ReadSequences
(
mseq_t *
prMSeq,
char *
seqfile,
int
iSeqType,
int
iSeqFmt,
int
iMaxNumSeq,
int
iMaxSeqLen
)
reads sequences from file
- Parameters:
-
[out] prMSeq Multiple sequence struct. Must be preallocated. FIXME: would make more sense to allocate it here.
[in] seqfile Sequence file name. If '-', sequences are read from stdin.
[in] iSeqType int-encoded sequence type. Set to SEQTYPE_UNKNOWN for autodetection (guessed from the first sequence)
[in] iMaxNumSeq Return an error if more than iMaxNumSeq sequences have been read
[in] iMaxSeqLen Return an error if a sequence longer than iMaxSeqLen has been read
- Returns:
- 0 on success, -1 on error
- Note:
- Depends heavily on squid
- Sequence file format will be guessed
- If supported by squid, gzipped files can be read as well.
bool SeqsAreAligned
(
mseq_t *
prMSeq
)
Checks whether the sequences in the given mseq structure are aligned. By definition this is true only if all sequences have the same length and at least one gap was found.
- Parameters:
-
[in] prMSeq Sequences to check
- Returns:
- TRUE if sequences are aligned, FALSE if not
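The definition above (equal lengths plus at least one gap) can be sketched as follows. The function operates on a plain array of strings instead of mseq_t; its name, the gap characters ('-' and '.') and the result for an empty input are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Sketch: true only if every sequence has the same length AND at least
 * one gap character occurs somewhere. Equal-length sequences without
 * any gap are therefore reported as not aligned. */
bool seqs_are_aligned_sketch(const char *const *seqs, int nseqs)
{
    bool gap_found = false;
    size_t first_len;
    int i;

    assert(seqs != NULL);
    if (nseqs < 1) {
        return false; /* assumption: nothing to align */
    }
    first_len = strlen(seqs[0]);
    for (i = 0; i < nseqs; i++) {
        if (strlen(seqs[i]) != first_len) {
            return false;
        }
        if (strpbrk(seqs[i], "-.") != NULL) {
            gap_found = true;
        }
    }
    return gap_found;
}
```

One consequence of this definition is that a gap-free alignment of equal-length sequences is indistinguishable from unaligned input, so the aligned flag may need to be set explicitly in that case.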
void SeqSwap
(
mseq_t *
prMSeq,
int
i,
int
j
)
Swap two sequences in an mseq_t structure.
- Parameters:
-
[out] prMSeq Multiple sequence struct
[in] i Index of first sequence
[in] j Index of second sequence
const char* SeqTypeToStr
(
int
iSeqType
)
convert int-encoded iSeqType to string
- Parameters:
-
[in] iSeqType int-encoded sequence type
- Returns:
- character pointer describing the sequence type
void ShuffleMSeq
(
mseq_t *
mseq
)
Shuffle mseq order.
- Parameters:
-
[out] mseq mseq structure to shuffle
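The source does not say which shuffling algorithm is used. As an illustration only, a Fisher-Yates shuffle over an array of sequence indices could look like the sketch below; the function name and the use of rand() are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch: Fisher-Yates shuffle of an index array. Walks from the last
 * position down, swapping each element with a randomly chosen element
 * at or before it. rand() % (i + 1) has a slight modulo bias, which is
 * acceptable for a sketch but not for statistically strict use. */
void shuffle_order_sketch(int *order, int n)
{
    int i;

    assert(order != NULL || n == 0);
    for (i = n - 1; i > 0; i--) {
        int j = rand() % (i + 1);
        int tmp = order[i];

        order[i] = order[j];
        order[j] = tmp;
    }
}
```

Shuffling an index array and then permuting the sequences accordingly keeps names, residues and any per-sequence metadata in step with each other.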
void SortMSeqByLength
(
mseq_t *
prMSeq,
const char
cOrder
)
Sort sequences by length.
- Parameters:
-
[out] prMSeq mseq to sort by length
[in] cOrder Sorting order. 'd' for descending, 'a' for ascending.
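A minimal sketch of sorting by length with an order character, using the standard qsort on a plain array of strings. The function name, the int return value and the rejection of unknown order characters are assumptions; the documented function returns void and works on mseq_t.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparator: shorter strings first. */
static int cmp_len_ascending(const void *a, const void *b)
{
    size_t la = strlen(*(const char *const *)a);
    size_t lb = strlen(*(const char *const *)b);

    return (la > lb) - (la < lb);
}

/* Comparator: longer strings first (ascending with arguments swapped). */
static int cmp_len_descending(const void *a, const void *b)
{
    return cmp_len_ascending(b, a);
}

/* Sketch: sorts seqs by length. order is 'a' (ascending) or
 * 'd' (descending). Returns 0 on success, -1 for any other order. */
int sort_by_length_sketch(const char **seqs, size_t nseqs, char order)
{
    assert(seqs != NULL);
    if (order == 'a') {
        qsort(seqs, nseqs, sizeof *seqs, cmp_len_ascending);
    } else if (order == 'd') {
        qsort(seqs, nseqs, sizeof *seqs, cmp_len_descending);
    } else {
        return -1;
    }
    return 0;
}
```

Note that qsort is not guaranteed to be stable, so sequences of equal length may end up in a different relative order than in the input.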
int WriteAlignment
(
mseq_t *
mseq,
const char *
pcAlnOutfile,
int
outfmt
)
Write alignment to file.
- Parameters:
-
[in] mseq The mseq_t struct containing the aligned sequences
[in] pcAlnOutfile The name of the output file
[in] outfmt The alignment output format (defined in squid.h)
- Returns:
- Non-zero on error
- Note:
- We create a temporary squid MSA struct in here because we never use it within clustal. We might be better off using the old clustal output routines instead.