
We used a SQL database to showcase how vectors can be stored and retrieved. SQL is in no way the right tool for indexing vectors and matching similar vectors together. At the end of the lesson you will see that I use a SQL query like the following to retrieve a similar vector:
SELECT vector FROM vectors ORDER BY abs(vector - ?) ASC
But to find the actual distance between vectors we need to calculate the Euclidean distance. This is easy to do in Python with NumPy, but in SQL it is a bit complicated because you essentially have to do the math in SQL on binary data. That complex SQL is out of scope for this lesson.
As expected, because I did not use the Euclidean distance between vectors to find which one is closest to our query_vector, we ended up with the wrong vector.
Here is how you would calculate it in Python:
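One way to avoid writing the distance math in SQL itself is to register a Python function with the database and call it from the query. The following is only a minimal sketch, assuming the database is SQLite accessed through Python's sqlite3 module and that each vector is stored as a float32 BLOB; the function name euclidean_distance and the in-memory table are hypothetical and are not taken from the lesson.

```python
import sqlite3
import numpy as np

def euclidean_distance(blob_a, blob_b):
    # Assumption: both BLOBs hold float32 values of the same length.
    a = np.frombuffer(blob_a, dtype=np.float32)
    b = np.frombuffer(blob_b, dtype=np.float32)
    return float(np.linalg.norm(a - b))

conn = sqlite3.connect(":memory:")
# Make the Python function callable from SQL with two arguments.
conn.create_function("euclidean_distance", 2, euclidean_distance)

conn.execute("CREATE TABLE vectors (vector BLOB)")
for v in ([1.2, 3.4, 2.1, 0.8], [2.7, 1.5, 3.9, 2.3]):
    blob = np.array(v, dtype=np.float32).tobytes()
    conn.execute("INSERT INTO vectors (vector) VALUES (?)", (blob,))

query_vector = np.array([1.0, 3.2, 2.0, 0.5], dtype=np.float32)
row = conn.execute(
    "SELECT vector FROM vectors "
    "ORDER BY euclidean_distance(vector, ?) ASC LIMIT 1",
    (query_vector.tobytes(),),
).fetchone()

closest = np.frombuffer(row[0], dtype=np.float32)
print(closest)
```

This still scans every row and computes every distance, so it does not replace the indexing that a vector database provides.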
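import numpy as np

# Two stored vectors and the query vector
vect1 = np.array([1.2, 3.4, 2.1, 0.8])
vect2 = np.array([2.7, 1.5, 3.9, 2.3])
qry_vect = np.array([1.0, 3.2, 2.0, 0.5])

# Euclidean (L2) distance from each stored vector to the query vector
d1 = np.linalg.norm(vect1 - qry_vect)
d2 = np.linalg.norm(vect2 - qry_vect)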
Then sort d1 and d2 to find the closest vector.
You will find that [1.2, 3.4, 2.1, 0.8] is the closest to our query_vector, not the other one shown in the video.
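With more than two stored vectors, the same calculation can be done for all of them at once. Here is a small sketch, assuming the stored vectors fit in a single NumPy array; the names stored, distances, and order are illustrative only.

```python
import numpy as np

# Each row is one stored vector.
stored = np.array([
    [1.2, 3.4, 2.1, 0.8],
    [2.7, 1.5, 3.9, 2.3],
])
qry_vect = np.array([1.0, 3.2, 2.0, 0.5])

# Euclidean distance from every row to the query vector.
distances = np.linalg.norm(stored - qry_vect, axis=1)

# Row indices sorted from closest to farthest.
order = np.argsort(distances)
closest = stored[order[0]]
print(closest, distances[order[0]])
```

The first index in order points to the closest vector, and the remaining indices give the rest in increasing order of distance.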
To do all of this on vectors properly we need a true vector database, which is the topic of the subsequent sections.
If you are using an older version of Pinecone (v2.2.1), this video tutorial is still relevant. If you have upgraded to a later version, please skip this one and see the next lesson.
You can skip this one if you already completed it in the previous section.
This lesson was prepared using Pinecone version 2.2.1. For the newest version (v5.0.0), please refer to the attached notebook. Only the initialization syntax has changed.
In this comprehensive course on Vector Databases, you will delve into the exciting world of cutting-edge technologies that are transforming the field of artificial intelligence (AI), particularly generative AI. With a focus on Future-Proofing Generative AI, this course will equip you with the knowledge and skills to harness the power of Vector Databases for advanced applications, including Large Language Models (LLMs), Generative Pretrained Transformers (GPT) like ChatGPT, and Artificial General Intelligence (AGI) development.
Starting from the foundations, you will learn the fundamentals of Vector Databases and their role in revolutionizing AI workflows. Through practical examples and hands-on coding exercises, you will explore techniques such as vector data indexing, storage, retrieval, and dimensionality reduction. You will also gain proficiency in integrating the Pinecone vector database with other tools like LangChain and the OpenAI API, using Python, to implement real-world use cases and unleash the full potential of Vector Databases.
Throughout the course, we will uncover the limitless possibilities of Vector Databases in generative AI. You will discover how these databases enable content generation, recommendation systems, language translation, and more. Additionally, we will discuss performance optimization, scalability considerations, and best practices for efficient implementation.
Led by an expert instructor with a PhD in computational nanoscience and extensive experience as a data scientist at leading companies, you will benefit from their deep knowledge, practical insights, and passion for teaching AI and Machine Learning (ML). Join us now to embark on this transformative learning journey and position yourself at the forefront of Future-Proofing Generative AI with Vector Databases. Enroll today and unlock a world of AI innovation!