-> 01/07/25: ECS124-L1
[Lecture Slide Link]
Vocab
Unit and Larger Context
Important to Frid, her background is there
3-4 assignments
4-5 timed online quizzes
2 midterms
1 final
Goals
- Role / Utility of bioinformatics in modern bio
- Learn basic biological, mathematical, and algorithmic concepts
- Master common bioinformatical terms
- Do simple programming
- PERL??? LOL
- Other languages too (striving for language agnostic)
- Computational Biology vs. Bioinformatics
- Scope of Comp. Biology?
- Modeling processes: modeling based on biological knowledge
- Bioinformatics?
- Applications of algorithms and statistical techniques to interpret biological data
=> leads to understanding biological organization of behavior
How we interpret / obtain info in biology:
- Observation driven
- Hypothesis -> Exp.
- Theory
or... - Large scale, high-throughput data accumulation
- Also a form of observation! Just done with a computer in place of human
-> Scratch Notes
T1. Sequence alignment
Used in many places:
- Phylogeny, also species finding
- Relationships
- Migration projects
- Genome sequencing
- Genome assembly
Algorithmically: - Dynamic programming
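A minimal sketch of the dynamic programming idea behind global alignment (Needleman-Wunsch style). The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions for illustration, not something given in lecture.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of a and b (scores are assumed values)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # a[:i] aligned against only gaps
    for j in range(1, cols):
        score[0][j] = j * gap          # b[:j] aligned against only gaps
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap     # gap in b
            left = score[i][j - 1] + gap   # gap in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]


print(global_alignment_score("ACGT", "AGT"))  # three matches and one gap
```

Each cell only depends on its three neighbors (diagonal, up, left), which is what makes the table fill work in O(len(a) * len(b)) time.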
T2. Database Search
Used:
- BLAST, searches all DNA in a database
- Function similarity (if other similar things create red blood cells, likely this one does too)
- Motif finding
Algorithmically: - Hashing, alignment
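A rough sketch of how hashing can speed up a database search: index every length-k substring (k-mer) of the database once, then look up the query's k-mers to find candidate hit positions ("seeds") that a real tool would go on to extend with alignment. The function names, the toy database, and k = 4 are all assumptions for illustration.

```python
from collections import defaultdict


def build_kmer_index(database, k):
    """Map each k-mer to the (sequence name, offset) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index


def find_seeds(query, index, k):
    """Return (query offset, sequence name, database offset) for each shared k-mer."""
    hits = []
    for q in range(len(query) - k + 1):
        for name, d in index.get(query[q:q + k], []):
            hits.append((q, name, d))
    return hits


# Toy database, made up for this example.
db = {"s1": "ACGTACGT", "s2": "TTTTGGGG"}
index = build_kmer_index(db, 4)
print(find_seeds("GTAC", index, 4))
```

The index is built once and reused for every query, so each lookup costs a hash probe instead of a scan over all the DNA.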
T3. Genome Assembly
Used:
Algorithmically:
- Graph algorithms
- Alignment
Genome assembly and alignment:
- Copy the strand (many copies)
- Break the copies up into fragments
- Read the fragments in chunks of characters at a time (about 100)
- If you can read enough chunks and find where they overlap, you can rebuild the original
Gluing them together (rebuilding) is graph theory
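A small sketch of the "find overlaps, then glue" idea. This uses a greedy heuristic (always merge the pair of reads with the largest suffix/prefix overlap), which is a simplification of the graph-based approach and not necessarily what the course uses. The function names, the minimum overlap of 3, and the toy reads are assumptions.

```python
def overlap(a, b, min_len=1):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more; leave the pieces separate
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Toy reads, made up for this example.
print(greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"]))
```

Greedy merging can go wrong on repeated regions, which is one reason real assemblers treat reads and overlaps as a graph and look for a path through it.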
T4. Exact String Matching
Uses:
- Finding transposable elements
- Gene search
Algorithmically: - 122B: Suffix Trees
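Suffix trees are too long to sketch here, so this example swaps in a suffix array (sorted list of suffix start positions) with binary search, which answers the same "where does this pattern occur exactly?" question. The construction below is the naive sort-all-suffixes version, fine for short strings only; names are assumptions.

```python
def build_suffix_array(text):
    """Start positions of all suffixes, in sorted suffix order (naive build)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, pattern, sa):
    """All start positions of pattern in text, via binary search on sa."""
    m = len(pattern)
    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


text = "GATTACATTAC"
sa = build_suffix_array(text)
print(find_all(text, "TTAC", sa))
```

All suffixes starting with the pattern sit next to each other in sorted order, so two binary searches find the whole block of matches.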
MACHINE LEARNING SECTION
1. Clustering
K-Means, Hierarchical Clustering
2. Classification
Naive Bayes
Have background data on DNA of people with and without cancer
- Use ML to predict the label for future DNA from unlabeled people
- Estimate probability of both possibilities
Later find out about the associated BRCA gene
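A toy sketch of the Naive Bayes idea: learn from labeled samples, then estimate the probability of both possibilities for a new one. Here each sample is a list of 0/1 flags (for example, "variant present at site j"), which is an assumed encoding. The data is made up, and add-one smoothing is an assumption to avoid zero probabilities.

```python
import math


def train_naive_bayes(samples, labels):
    """Estimate P(label) and P(feature = 1 | label) with add-one smoothing."""
    model = {}
    n_features = len(samples[0])
    for label in set(labels):
        rows = [s for s, y in zip(samples, labels) if y == label]
        prior = len(rows) / len(samples)
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(n_features)]
        model[label] = (prior, probs)
    return model


def posterior(model, x):
    """P(label | x) for every label, treating features as independent."""
    log_scores = {}
    for label, (prior, probs) in model.items():
        s = math.log(prior)
        for xi, p in zip(x, probs):
            s += math.log(p if xi else 1 - p)
        log_scores[label] = s
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Made-up training data: two binary features per person.
samples = [[1, 1], [1, 0], [0, 0], [0, 1]]
labels = ["cancer", "cancer", "healthy", "healthy"]
model = train_naive_bayes(samples, labels)
print(posterior(model, [1, 0]))
```

The "naive" part is multiplying per-feature probabilities as if the features were independent given the label.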
3. Phylogenetics
Jukes-Cantor?
Ultrametric trees
Nearest Neighbors
4. HMM DP
Tree of knowledge expansion?
Biological knowledge - DNA mutates
|
Biological Model - Replicate, mutation rate
|
Mathematical Model and Assumptions - Comp. Biology
|
Mathematical Problem, I/O Description
|
Algorithmic Problem / Solution - Transforms input INTO output
- This level is the focus of the course, high level algorithmic overview
|
Programming problem - Grunt work of building the code out - Done in labs
Counting reminder:
Stages of experiment
If an experiment has multiple stages (stage 1, ..., stage k):
- Stage i
# outcomes is n_i
Total sample space size is n_1 * n_2 * ... * n_k
EX) If sequence,
Picking
How many ways can I select a president and VP from a sample of 130?
- = 130 * 129
What about just 2 founders, no roles?
- Divide by the number of possible reorderings, which are equivalent when order does not matter
- = 130 * 129 / 2
How many ways to create a mating group of 3 subjects out of 140 subjects?
- 140 choose 3.
How many possible mitochondrial sequences (hypothetically 300 long) can exist? Probability of 2 identical sets?
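A quick check of the counting questions above using Python's standard library (`math.perm` and `math.comb`, Python 3.8+). The last two lines assume a 4-letter alphabet (A, C, G, T) and that two sequences are drawn independently and uniformly at random, which is a simplifying assumption and not a biological claim.

```python
import math

# Ordered picks (president, then VP) from 130 people: 130 * 129
ordered_pairs = math.perm(130, 2)        # 16770

# Unordered picks (2 founders, no roles): divide by the 2 reorderings
unordered_pairs = math.comb(130, 2)      # 8385

# Mating groups of 3 out of 140 subjects: "140 choose 3"
mating_groups = math.comb(140, 3)        # 447580

# Sequences of length 300 over {A, C, G, T}: 4 choices at each of 300 stages
mito_sequences = 4 ** 300

# Chance that two independent, uniformly random sequences are identical
# (assumed model): the second must match the first at every position.
p_identical = 1 / mito_sequences

print(ordered_pairs, unordered_pairs, mating_groups)
print(len(str(mito_sequences)), "digits in 4**300")
print(p_identical)
```

The sequence count is the multi-stage rule with 300 stages of 4 outcomes each, so the total is 4 multiplied by itself 300 times.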
Sequence Alignment
"FACT" (assumption)
- High DNA / RNA / Protein sequence similarity implies significant function or structural similarity
- Similar sequence implies similar function/structure
- In RNA it might not be the function, but the structure implied
-> Refresh the Info
Did you generally find the overall content understandable or compelling or relevant or not, and why, or which aspects of the reading were most novel or challenging for you and which aspects were most familiar or straightforward?
Did a specific aspect of the reading raise questions for you or relate to other ideas and findings you've encountered, or are there other related issues you wish had been covered?
-> Links
Resources
- Put useful links here
Connections
- Link all related words