📗 -> 01/28/25: ECS124-L7


🎤 Vocab

❗ Unit and Larger Context

Review before midterm

✒️ -> Scratch Notes

Last time

  • Covered BLAST
    • 2 main events in blast:
    1. Preprocessing set of functions/events. Does 2 main things.
      1. Creates HSP = Highly related 3 characters strings
        HSP( {ATT} ) -> ATA, ATC, CTA
        ATT -> ATA, ATC, CTA. For all 3 possible characters?
      2. All known sequences, breaks them into substrings of length 3, AND stores in a different table where that string length 3 is present.
        S1 = CTATATAT
        Would get grouped into the following 3 sequences:
        CTA, TAT, ATA, TAT, …
        S100 = ---x200--- ATA
ATA[s1, 3], [s1, 5], [s100, 201]
2) Running BLAST
	1) Takes sequences and breaks it into unique k-mers of size 3
	   GAATTATT
	   GAA, AAT, ATT, TTA, TAT
		   - Only include unique ones, not the second ATT

Store a table of all possible 3 chars:

Blossom 62 (or 82?) -

  • Our goal is to find sequences that max our query in thousands of sequences

20 possible protein sequences:

Optimality

OPT(i,j) = Max similarity/alignment score for sequence x[1...i] and y[1..j]

  • x = “happydayare”
  • y = “ppydayaretoday”
  • OPT(5,3): Max align between x=“happy” and y=“ppy”
  • OPT(i,j) = max(a,b,c)
    • Where a=matching characters, b and c are space

spaces and mismatches cost zero, matches+1

score(i,j) defined as: 
	if (x[i]==y[j]) Return 1
	else Return 0

What does the following statement describe, assuming its the optimal alignment?
OPT(i-3,j) + space(3)

Optimal similarity score for x(1…i) and y(1…j) is created by:

  • | x(1…i-3) | i-2 | i-1 | i |
  • | y(1…j) | - | - | - |
    Essentially saying the optimal alignment is matching up the last 3 letters of x (i-2, i-1, i) with spaces/gaps

if matches something in s2
else

The probability that we match something: 1/4

🧪 -> Refresh the Info

Did you generally find the overall content understandable or compelling or relevant or not, and why, or which aspects of the reading were most novel or challenging for you and which aspects were most familiar or straightforward?)

Did a specific aspect of the reading raise questions for you or relate to other ideas and findings you’ve encountered, or are there other related issues you wish had been covered?)

Resources

  • Put useful links here

Connections

  • Link all related words