Building and querying a kraken2 database
Highest Rated
Rating: 4.5 out of 5 (18 ratings)
114 students

A useful Next Generation Sequencing (NGS) Metagenomics tool
Last updated 2/2021
English

What you'll learn

  • how to design, build and query a kraken2 database
  • how to classify Next Generation Sequence reads to a metagenomics database
  • how to visualize a classification report in Pavian

Course content

4 sections, 20 lectures, 2h 0m total length
  • Introduction (5:03)

    Build a kraken2 database for metagenomics, matching sample DNA to reference sequences to identify pathogens and microbial communities in biomedical and environmental contexts.

  • Installing kraken2 (2:49)
  • Kraken2 database and query input (9:42)
  • Kraken2 database input
  • Search the SRA website for SARS-CoV-2 data sets
  • How to use sequences from outside NCBI (2:58)

    Learn to process sequences from outside NCBI by identifying the species with the NCBI taxonomy, obtaining the taxonomy ID, and adding an annotation line with two pipes and the ID.

  • Find NCBI taxon id for several species
  • The kraken2 algorithm (8:34)
  • Understanding the kraken2 algorithm
  • How many sequences to add to your database? (4:12)
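
The lecture on sequences from outside NCBI describes adding the taxonomy ID to the sequence header between two pipes. A minimal sketch of that step is shown here, assuming the `kraken:taxid|<ID>` header convention from the kraken2 manual. The function name `add_kraken_taxid` and the example sequence are hypothetical and are not taken from the course material.

```python
def add_kraken_taxid(fasta_text: str, taxid: int) -> str:
    """Append '|kraken:taxid|<taxid>' to the sequence ID of every FASTA header.

    Assumption: all sequences in fasta_text belong to the same taxon.
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Split the header into the sequence ID and an optional description.
            parts = line[1:].split(None, 1)
            seq_id = parts[0] if parts else ""
            desc = f" {parts[1]}" if len(parts) > 1 else ""
            line = f">{seq_id}|kraken:taxid|{taxid}{desc}"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical example: 2697049 is the NCBI taxonomy ID for SARS-CoV-2.
example = ">seq1 my isolate\nACGT\n"
print(add_kraken_taxid(example, 2697049), end="")
```

The sequence lines are left untouched; only the header lines change, so the output can be written back to a FASTA file and added to the database library.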

Requirements

  • basic Linux knowledge
  • understand basic shell commands

Description

Kraken2 is a well-known Next Generation Sequencing (NGS) metagenomics classification tool that is widely used in scientific research. With kraken2 you can build a database from whole genome sequences and then classify sequencing reads against it to identify unknown samples. It is easy to use and classifies samples very rapidly. Kraken2 has applications in biology, medical research and even geology.

In this video we will get a general overview of the algorithm behind kraken2. Then we will work through a hands-on example in which we download read data sets from the SRA database at NCBI (the National Center for Biotechnology Information) and classify them against the database that we have built. We will also look at the Pavian visualization software, which presents a graphical overview of the results (e.g. a Sankey diagram). Figures made in this program can also be used for publication purposes.
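
The hands-on workflow described above can be sketched as a sequence of command lines. This is only an illustrative outline, not the exact commands used in the course: the helper `kraken2_workflow` and all file and database names are hypothetical, and the code only assembles and prints the commands rather than running them. The options shown (`--download-taxonomy`, `--add-to-library`, `--build`, `--db`, `--threads`, `--report`, `--output`) are standard `kraken2-build` and `kraken2` options.

```python
import shlex


def kraken2_workflow(db, genomes, reads, report="report.txt", threads=4):
    """Return the build-and-classify commands as argument lists (nothing is executed)."""
    # 1. Download the NCBI taxonomy into the database directory.
    cmds = [["kraken2-build", "--download-taxonomy", "--db", db]]
    # 2. Add each reference genome (FASTA) to the database library.
    for fasta in genomes:
        cmds.append(["kraken2-build", "--add-to-library", fasta, "--db", db])
    # 3. Build the database from the library and taxonomy.
    cmds.append(["kraken2-build", "--build", "--threads", str(threads), "--db", db])
    # 4. Classify the reads; the report file can then be loaded into Pavian.
    cmds.append([
        "kraken2", "--db", db, "--threads", str(threads),
        "--report", report, "--output", "classified.txt", reads,
    ])
    return cmds


# Hypothetical file names, for illustration only.
for cmd in kraken2_workflow("mydb", ["genome1.fa", "genome2.fa"], "reads.fastq"):
    print(shlex.join(cmd))
```

Keeping the commands as argument lists makes it straightforward to pass each one to `subprocess.run` once kraken2 is installed.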

Students of this course are mainly biology or medical students or researchers. Some knowledge of Linux is required to take the course, but only a few commands need clarification, and they will be explained in detail.

Overall, the skills taught in this course are worth learning and can give you an edge in metagenomics research and analysis.

Who this course is for:

  • students studying bioinformatics or biotechnology