Review 1
BLAST
- Hash table:
- Split each sequence into k-mers; the k-mers are the keys
- Each entry stores where that k-mer occurs in the original sequences, as (sequence, position) pairs
| k-mer | Positions (sequence, index) |
|---|---|
| ATT | (s1, 1), (s3, 1) |
| TTA | (s1, 2), (s3, 3) |
| TAT | (s1, 3), (s2, 1), (s3, 4) |
| ATG | (s2, 2) |
| TGT | (s2, 3) |
| … |
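
A minimal sketch of building this kind of k-mer index. The function name `build_kmer_index` and the three input sequences are assumptions: the sequences are not given in the notes, they are just ones that happen to reproduce the table rows above (1-based positions).

```python
from collections import defaultdict


def build_kmer_index(sequences, k=3):
    """Map each k-mer to a list of (sequence name, 1-based position) pairs."""
    index = defaultdict(list)
    for name, seq in sequences.items():
        # Slide a window of width k across the sequence.
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i + 1))
    return dict(index)


# Assumed example sequences (inferred from the table, not from the notes).
seqs = {"s1": "ATTAT", "s2": "TATGT", "s3": "ATTTAT"}
index = build_kmer_index(seqs)
print(index["TAT"])
```

Looking up a query k-mer in `index` then gives every place it occurs, which is the seed lookup step these notes describe.
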
Bayes Method
K-Means
- Assign each point to the cluster whose center is the smallest distance away
- Move each cluster center to the average of the points closest to it
- The algorithm ends when the cluster centers no longer move
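
The loop above can be sketched in plain Python. This is only an illustration: the function name `kmeans`, the squared Euclidean distance, passing in the starting centers by hand, and the `max_iter` cap are all assumptions, not something from the notes.

```python
def kmeans(points, centers, max_iter=100):
    """Toy k-means on tuples of floats; starting centers are supplied by the caller."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Step 1: assign each point to the nearest center (squared Euclidean distance).
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            clusters[nearest].append(p)
        # Step 2: move each center to the mean of its assigned points.
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # empty cluster: leave its center in place
        # Step 3: stop once no center moved.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
final_centers, final_clusters = kmeans(pts, [(0.0, 0.0), (10.0, 0.0)])
print(final_centers)
```
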
Genome Assembly
Suffix Tree
UPGMA
de Bruijn graph to find the Hamiltonian path
- k-mers of length 3:
- ACTAG: (ACT) -> (CTA) -> (TAG)
- Create nodes of length 3; each node points to a k-mer that could follow it (its last 2 letters match the next node's first 2 letters)
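
A small sketch of the ACTAG example, following the formulation in these notes (k-mers as nodes, Hamiltonian path through them). All function names are hypothetical, and the backtracking search is an assumption chosen for simplicity: it is exponential in the worst case, so it only suits tiny inputs like this one.

```python
def kmers(seq, k=3):
    """All length-k substrings of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_graph(nodes):
    """Edge a -> b when the last k-1 letters of a equal the first k-1 letters of b."""
    return {a: [b for b in nodes if a != b and a[1:] == b[:-1]] for a in nodes}


def hamiltonian_path(graph):
    """Backtracking search for a path that visits every node exactly once."""
    def extend(path, seen):
        if len(path) == len(graph):
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                found = extend(path + [nxt], seen | {nxt})
                if found:
                    return found
        return None

    for start in graph:
        found = extend([start], {start})
        if found:
            return found
    return None


def spell(path):
    """Rebuild the sequence: first node, then the last letter of each later node."""
    return path[0] + "".join(node[-1] for node in path[1:])


nodes = sorted(set(kmers("ACTAG")))
graph = build_graph(nodes)
path = hamiltonian_path(graph)
print(path, spell(path))
```
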