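DNA and AA alphabets: DNA is a molecule that contains genetic data. It is based on a chemical alphabet of 4 characters: A, C, G, T. DNA is a kind of long-term storage for the information of life. Proteins use a second alphabet: the 20 standard amino acids (AA), each usually written as a one-letter code.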
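The two alphabets can be written down directly as sets of letters. The following is a minimal sketch: the names `DNA_ALPHABET`, `AA_ALPHABET`, and `is_valid` are hypothetical helpers chosen for illustration, and the amino acid set is assumed to be the 20 standard one-letter codes only (no ambiguity codes).

```python
# The 4-letter DNA alphabet.
DNA_ALPHABET = set("ACGT")

# Assumption: only the 20 standard amino acid one-letter codes.
AA_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")


def is_valid(sequence, alphabet):
    """Return True if every character of sequence belongs to alphabet."""
    return set(sequence.upper()) <= alphabet


print(is_valid("GATTACA", DNA_ALPHABET))  # a DNA string
print(is_valid("MKVL", DNA_ALPHABET))     # a peptide is not valid DNA
print(is_valid("MKVL", AA_ALPHABET))      # but it is a valid AA string
```

Note that every DNA letter is also an amino acid letter, so a string like "GATTACA" passes both checks; the alphabet alone cannot tell the two sequence types apart.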

What are paired-end reads?: Both ends of a DNA fragment are sequenced. The approximate distance between the two reads is known, and this information can be used to improve genome assembly.
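A toy sketch of the idea: one read is taken from each end of a fragment, and the fragment length fixes the gap between them. The function names and the example fragment are made up for illustration. The sketch assumes error-free reads and assumes the second read comes from the opposite strand, so it is reported as a reverse complement.

```python
def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def paired_end_reads(fragment, read_length):
    """Return (read1, read2, inner_distance) for one fragment.

    read1 is the first read_length bases of the fragment.
    read2 is the last read_length bases, reverse complemented
    (assumption: it is sequenced from the opposite strand).
    inner_distance is the number of unsequenced bases in between.
    """
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment[-read_length:])
    inner_distance = max(len(fragment) - 2 * read_length, 0)
    return read1, read2, inner_distance


fragment = "ACGTTGCAAGGCTTAACCGG"  # hypothetical 20-base fragment
print(paired_end_reads(fragment, 5))  # ('ACGTT', 'CCGGT', 10)
```

An assembler can use the known gap as a constraint: if the two reads land in different contigs, those contigs should sit roughly that distance apart.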

What's a genome?: A genome is the full genetic information of an organism. It has coding parts, which encode specific proteins, and non-coding parts, which have other functions, such as regulating protein production.

Coding versus non-coding sequence data: Coding DNA is the DNA that encodes protein information; it is transcribed into mRNA, which is then translated into protein. Non-coding DNA does not encode proteins; parts of it encode other RNA types, such as rRNA and tRNA.

Name some model organisms: Drosophila (fruit fly) and gut bacteria such as Escherichia coli.

Why do we use model organisms?: The main reason is a short life cycle: we can observe a large number of generations in a short time, which is far more convenient than observing long-lived organisms. Other reasons are that they are cheap to keep and raise fewer ethical concerns. Some organisms, such as certain plants, are also often used because of their small genomes.

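DNA: what's the 3' and 5' end?: The numbers refer to carbon atoms in the sugar (deoxyribose) of the DNA backbone. One end of a strand has a free phosphate group on the 5' carbon, and the other end has a free hydroxyl group on the 3' carbon. By convention, we read DNA sequences from 5 prime to 3 prime.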
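The reading convention matters because the two strands of a double helix run in opposite directions. To write the partner strand in the same 5' to 3' direction, it has to be complemented and then reversed. A minimal sketch, with the helper name `reverse_complement` and the example sequence chosen only for illustration:

```python
# Translation table mapping each base to its pairing partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand of seq, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


top = "ATGCCG"
print("5'-" + top + "-3'")                      # 5'-ATGCCG-3'
print("5'-" + reverse_complement(top) + "-3'")  # 5'-CGGCAT-3'
```

Applying the function twice returns the original sequence, which is a quick sanity check on any implementation.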