
If you're running into issues launching a notebook, it's because AWS no longer supports Studio Classic. See NewSageMakerStudio.mov in the resources section for this lecture for how to navigate to a Jupyter notebook in the new UI (it's just through JupyterLab now). You may want to select a larger instance size on launch.
Additionally, the AI space moves fast, and I encourage anyone taking this course to check out the latest chatbot model leaderboard and embeddings leaderboard. Swap the latest high-performing models from the leaderboards into the RAG pipeline you build in this course. If you find this course difficult, check out Google's NotebookLM, a nice UI that you can easily upload source docs to, which runs (vector-based, as of today) RAG under the hood. If you want to improve on the RAG pipeline we build in this course even more, beyond just testing out new models, I suggest testing graph-based RAG, which is said to be more performant than vector-based RAG (which this course uses). I have yet to test Graph RAG myself but definitely want to!
Code for creating text embeddings and the corresponding vectorstore using LangChain with Amazon Titan Text Embeddings, or with other HuggingFace or private embedding models (walks through the HuggingFace leaderboard).
Note: If using the Claude 3 Opus model, uncomment the code in the cells marked # To use Claude 3 Opus: in your Jupyter notebook, Chatbot_u.ipynb.
Build and deploy an LLM-based chatbot to answer questions using your private dataset!
To build, we will use Anthropic’s Claude 2 LLM on Amazon Bedrock and LangChain, with a RAG (retrieval-augmented generation) implementation. I’ll also show you the code for using other LLMs (OpenAI’s ChatGPT, GPT-4, etc.) in place of Claude 2, as well as other embedding models in place of Amazon Titan Text Embeddings. To deploy, we will use Gradio.
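To make the vector-based RAG idea concrete, here is a minimal sketch of the retrieve-then-prompt flow in plain Python. It is not the course's code: the bag-of-words `embed` function is a toy stand-in for a real embeddings model such as Amazon Titan Text Embeddings, `ToyVectorStore` and `build_prompt` are hypothetical names made up for illustration, and the final prompt would be sent to an LLM rather than printed.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embeddings model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory vector store: keeps (document, vector) pairs."""

    def __init__(self, docs):
        self.entries = [(doc, embed(doc)) for doc in docs]

    def search(self, query, k=2):
        """Return the k documents most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def build_prompt(question, context_docs):
    """Combine the retrieved context and the question into one LLM prompt."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up example documents standing in for a private dataset.
docs = [
    "Refunds are processed within five business days.",
    "New employees complete onboarding in their first week.",
    "Support is available by email on weekdays.",
]
store = ToyVectorStore(docs)
question = "How long do refunds take?"
top_docs = store.search(question, k=1)
prompt = build_prompt(question, top_docs)
print(prompt)
```

In a real pipeline, the embedding step and the store would come from an embeddings model and a vectorstore library, and the prompt would go to the LLM; the retrieve-then-prompt shape stays the same.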
This is an end-to-end build of a chatbot solution that can be used within your organization. Use this Generative AI solution to enhance customer support, streamline information retrieval, aid in the training and onboarding of new employees, promote data-driven decision making, customize insights for clients and customers, and enable efficient knowledge sharing.
I also cover programmatic audio/video transcription within the data collection step. This can be applied to other use cases within your organization, outside of the chatbot. For example, you can transcribe audio/video content to build content recommendation models.
This course is best for those with a mid- to senior-level understanding of Python and Data Science. If you're more of a beginner, feel free to dive in and ask questions along the way. Hopefully you all enjoy this course and have fun with this project!