Information Retrieval and Mining Massive Data Sets
3.7 (73 ratings)
1,595 students enrolled

Learn various techniques to build a Google scale Information Retrieval System.
Last updated 4/2014
English
English [Auto-generated]
This course includes
  • 39 hours on-demand video
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • The course is primarily divided into 6 parts.
  • Part 1: Building an Information Retrieval System
  • Part 2: Mining Frequent Patterns and Associations
  • Part 3: Classification and Clustering
  • Part 4: Web Mining
  • Part 5: Recommendation Systems
Course content
123 lectures 39:08:34
+ Introduction To a Boolean Search Engine
15 lectures 02:57:52

In this video we describe what data mining is

Preview 07:59

In this video we talk about structured data, unstructured data, and information retrieval

Structured Data, Unstructured Data and Information Retrieval
16:57

In this video we describe the term-document incidence matrix (part 1)

Term-Document Incidence Matrix (1)
06:37

In this video we describe the term-document incidence matrix (part 2)

Term-Document Incidence Matrix (2)
05:53

In this video we talk about the inverted index

Inverted Index
17:14

In this video we talk about tradeoffs in implementing an inverted index

Tradeoffs in implementing an Inverted Index
13:07

In this video we describe processing AND, OR, and NOT queries

Processing AND, OR, NOT queries
19:11

In this video we give an overview of the index construction pipeline

Overview of Index Construction Pipeline
19:10

In this video we describe query optimization using document frequency (part 1)

Query optimization using Document Frequency (1)
09:54

In this video we talk about query optimization using document frequency (part 2)

Query Optimization Using Document Frequency (2)
11:28

In this video we describe the Boolean retrieval model

Preview 12:22

In this video we describe an example of a Boolean retrieval model

Example of a Boolean Retrieval Model
16:00

In this video we describe the limitations of the Boolean retrieval model

Limitations of Boolean Retrieval Model
06:53

In this video we talk about how to evaluate the performance of an IR system

How to evaluate performance of an IR System
09:39
Google Zeitgeist
05:28
+ Dictionary Data Structure. Tolerant retrieval
8 lectures 03:22:03

In this video we talk about parsing documents

Parsing Documents and Issues Associated with it
32:57

In this video we describe the tokenization process in an IR system

Tokenization Process in an IR System
47:03

In this video we talk about normalization to terms

Preview 59:32

In this video we describe faster postings merges with skip pointers

Faster Postings Merges With Skip Pointers
15:01

In this video we describe how to handle phrase queries

How to Handle Phrase Query
12:50

In this video we describe phrase queries using a positional index

Phrase Query Using Positional Index
20:02

In this video we talk about how to handle proximity queries

How to handle proximity query
03:18

In this video we discuss positional index size

Discussion on Positional Index Size
11:20
+ Index construction. Postings size estimation, sort-based indexing, dynamic index
15 lectures 05:22:53

In this video we describe dictionary data structure implementation

Dictionary Data Structure Implementation
36:03

In this video we describe wildcard queries

Wild card queries
09:02

In this video we discuss questions on wildcard queries

Questions on Wild Card Queries
17:07

In this video we talk about wildcard query handling using a permuterm index

Wild Card Query Handling Using Permuterm Index
29:25

In this video we talk about wildcard query handling using a k-gram index

Wild Card Query Handling Using K-Gram Index
16:13

In this video we talk about the Soundex algorithm

Soundex Algorithm
11:47

In this video we describe spelling correction

Spelling Correction Techniques in an IR System
10:42

In this video we discuss a question on the Soundex algorithm

Question On Soundex Algorithm
08:35

In this video we give an introduction to spelling correction

Spelling Correction (Part 2)
17:52

In this video we give an introduction to dynamic programming

Introduction To Dynamic Programming
09:47

In this video we talk about how to calculate the edit distance between two strings

How To Calculate Edit Distance Between Two Strings
01:03:13

In this video we describe spelling correction using weighted edit distance

Spelling Correction Using Weighted Edit Distance
24:13

In this video we describe spelling correction using the n-gram overlap technique

Spelling Correction Using Ngram Overlap Technique
21:34

In this video we talk about calculating the Jaccard coefficient

Calculating Jaccard Coefficient (An Example)
34:38

In this video we talk about context-sensitive spelling correction

Context Sensitive Spell Correction
12:42
+ Dictionary Compression, Posting Compression
11 lectures 03:19:46

In this video we give an introduction to index construction

Introduction to Index Construction
29:30

In this video we describe index construction using in-memory sorting

Index Construction Using In-Memory Sorting
13:10

In this video we talk about index construction using the BSBI algorithm

Index Construction Using BSBI Algorithm
43:33

In this video we talk about index construction using the SPIMI algorithm

Index Construction Using SPIMI Algorithm
12:32

In this video we give an introduction to distributed indexing

Introduction To Distributed Indexing
10:38

In this video we describe how to build distributed indexes

Preview 25:02

In this video we go through Q & A on the distributed index

Q & A on Distributed Index
10:00

In this video we talk about MapReduce

Map Reduce
24:19

In this video we talk about dynamic indexing using a naive approach

Dynamic indexing using naive approach
11:03

In this video we talk about dynamic indexing using logarithmic merge

Dynamic indexing using logarithmic merge
17:02

In this video we describe issues with multiple indexes

Issues With Multiple Indexes
02:57
+ Scoring, term weighting, and the vector space model
5 lectures 02:54:51

In this video we describe why we compress indexes

Why do we compress indexes
07:09

In this video we talk about important statistics of the RCV collection

Important Statistics about RCV Collection
41:41

In this video we describe dictionary compression techniques

Various Dictionary Compression Techniques
31:44

In this video we talk about various dictionary compression techniques (part 2)

Various Dictionary Compression Techniques Part 2
35:07

In this video we describe various posting compression techniques

Various Posting Compression Techniques
59:10
+ Efficient vector space scoring. Nearest neighbor techniques
10 lectures 02:46:31

In this video we talk about the ranked retrieval model

Preview 13:30

In this video we describe the Jaccard score

Jaccard Score
14:05

In this video we describe term frequency weighting and the bag-of-words model

Term Frequency Weighting And Bag Of Words Model
24:39

In this video we talk about inverse document frequency

Inverse Document Frequency
16:42

In this video we describe the TF-IDF score

TF-IDF Score
12:14

In this video we describe documents as TF-IDF vectors

Documents As TF-IDF Vectors
09:57

In this video we talk about length normalization

Length Normalization
28:49

In this video we describe a cosine similarity example

Cosine Similarity Example
24:41

In this video we talk about computing cosine scores on an index

Computing Cosine Scores On Index
17:51

In this video we describe variants of TF-IDF weights

Variants of TF-IDF Weights
04:03
+ Evaluating search engines. User happiness, precision, recall, F-measure
12 lectures 02:31:18

In this video we describe term-at-a-time scoring

Term at a Time Scoring
14:59

In this video we talk about efficient cosine ranking

Efficient Cosine Ranking
14:51

In this video we describe a generic approach for speeding up cosine similarity

Generic Approach For Speeding up Cosine Similarity
05:48

In this video we describe index elimination

Preview 15:19

In this video we talk about champion lists

Champion Lists
07:48

In this video we describe the static quality score

Static Quality Score
18:56

In this video we talk about high and low lists

High And Low Lists
02:16

In this video we talk about impact-ordered postings

Impact Ordered Posting
07:38

In this video we describe cluster pruning

Cluster Pruning
18:33

In this video we describe parametric, zone, and tiered indexes

Parametric Zone Tiered Index
20:46

In this video we talk about query term proximity and query parsing

Query Term Proximity And Query Parsing
14:35

In this video we describe how a search engine works

How A Search Engine Works
09:49
+ Advertisement System. Google AdSense. Search Engine Optimization
5 lectures 01:31:28

In this video we talk about search engine evaluation (part 1)

Preview 16:53

In this video we describe search engine evaluation (part 2)

Performance of a Search Engine Part 2
22:50

In this video we describe search engine evaluation (part 3)

Performance of a Search Engine Part 3
10:44

In this video we describe search engine evaluation (part 4)

Performance of a Search Engine Part 4
16:27

In this video we talk about search engine evaluation (part 5)

Performance of a Search Engine Part 5
24:34
+ Supervised Learning. Text Classification. Naive-Bayes Text Classification
4 lectures 01:40:41

In this video we describe e-commerce vs. traditional businesses

ECommerce Vs. Traditional Businesses
10:21

In this video we describe pricing models for online advertisement

Pricing Models For Online Advertisement
39:44

In this video we describe AdWords and AdSense

AdWords and AdSense
32:39

In this video we describe SEM and SEO

SEM And SEO
17:57
+ Link analysis. Web as a graph. PageRank
9 lectures 02:49:15

In this video we give an introduction to classification

Classification System
20:29

In this video we talk about document classification

Document Classification
07:53

In this video we describe manual classification methods

Manual Classification Methods
04:54

In this video we talk about Naive Bayes classifiers

Naive Bayes Classifiers
56:18

In this video we talk about Bayes' rule for text classification

Preview 17:23

In this video we describe various classification methods

Various Classification Methods
03:27

In this video we describe an example of the multivariate Bernoulli model

Example of Multivariate Bernoulli Model
27:33

In this video we describe the second version of Naive Bayes

Second Version of Naive Bayes
10:37

In this video we talk about a worked-out example of the second version of Naive Bayes

Example of Second Version of Naive Bayes
20:41
Requirements
  • Knowledge of probability and linear algebra.
  • Good grasp of graduate-level algorithms.
  • Experience with a programming language (C, Python, Java).
Description

The goal is to introduce the techniques required to build an IR system. In this course we will explore various methods for solving big data problems and evaluate alternative solutions and their trade-offs. In the later part of the course we will discuss data mining algorithms for making sense of massive data sets.

Who this course is for:
  • Big Data Enthusiasts
  • Data Scientists