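● DNA and AA alphabets: DNA is a molecule that stores genetic information. It is written in a chemical alphabet of 4 characters: A, C, G, T (the bases adenine, cytosine, guanine, and thymine). DNA serves as the long-term storage of an organism's hereditary information. The amino acid (AA) alphabet, used for proteins, has 20 standard characters.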
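● What are paired-end reads?: Both ends of a DNA fragment are sequenced, which gives two reads per fragment. The approximate distance between the two reads is known. This information can be used to improve the quality of genome assembly, for example when placing reads that fall in repetitive regions.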
● What's a genome?: The genome is the complete genetic information of an organism. It has coding parts, which encode specific proteins, and non-coding parts, which have other functions such as regulating protein production.
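● Coding versus non-coding sequence data: Coding DNA is the DNA that encodes protein information: it is transcribed into mRNA, which is then translated into protein. Non-coding DNA does not encode proteins. It includes the genes for other RNA types, such as rRNA and tRNA, as well as regulatory sequences.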
● Name some model organisms: Drosophila melanogaster (fruit fly) and gut bacteria such as Escherichia coli.
● Why do we use model organisms?: The main reason is their short life cycle: we can observe a large number of generations in a short time, which is far more convenient than observing long-lived organisms. Other reasons are that they are cheap to keep and raise fewer ethical concerns. Some organisms, such as certain plants, are also used because their genomes are small.
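● DNA: what's the 3' and 5' end?: The numbers refer to carbon atoms of the deoxyribose sugar in the DNA backbone. One end of a strand terminates at the 5' carbon (usually carrying a phosphate group) and the other at the 3' carbon (carrying a hydroxyl group). By convention, DNA sequences are read and written from 5' to 3'.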
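
The 4-letter DNA alphabet and the 5'-to-3' reading convention can be illustrated with a small Python sketch. It goes slightly beyond the notes by assuming standard base pairing (A with T, C with G), and the function names `is_dna` and `reverse_complement` are made up for this example. Since the two strands of DNA run in opposite directions, writing the opposite strand 5' to 3' means complementing each base and then reversing the string.

```python
# Illustrative sketch; helper names are hypothetical, not from the notes.
DNA_ALPHABET = set("ACGT")

# Assumes standard base pairing: A<->T and C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_dna(seq: str) -> bool:
    """Return True if seq is non-empty and uses only A, C, G, T."""
    return bool(seq) and set(seq.upper()) <= DNA_ALPHABET


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3' like the input."""
    if not is_dna(seq):
        raise ValueError("sequence contains characters outside A, C, G, T")
    # Complement every base, then reverse so the result also reads 5' -> 3'.
    return seq.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```

Applying `reverse_complement` twice returns the original sequence, which is a quick sanity check for this kind of helper.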