Spark NLP for Data Scientists
What you'll learn
- Use 20,000+ state-of-the-art NLP models in 200+ languages
- Train and tune your own NLP models by applying Spark NLP's predefined classifier architecture to your own datasets
- Perform popular NLU tasks in one line of code, such as generating text, summarizing text, and answering questions
- Deploy models as APIs with NLP Server, a Docker container that contains all of Spark NLP's capabilities
Requirements
- Required: hands-on experience with Python
- Recommended: basic understanding of machine learning and natural language processing
- Nice to have: basic understanding of Apache Spark
Description
Welcome to the Spark NLP for Data Scientists course!
This course walks you through building state-of-the-art natural language processing (NLP) solutions using John Snow Labs’ open-source Spark NLP library. The library includes more than 20,000 pretrained models covering 250+ languages. The course is designed for data scientists: you will write and run live Python notebooks that cover the majority of the open-source library’s functionality. This includes reusing, training, and combining models for NLP tasks such as named entity recognition, text classification, spelling and grammar correction, question answering, knowledge extraction, sentiment analysis, and more.
The course is divided into 11 sections: Text Processing, Information Extraction, Dependency Parsing, Text Representation with Embeddings, Sentiment Analysis, Text Classification, Named Entity Recognition, Question Answering, Multilingual NLP, Advanced Topics such as speech-to-text recognition, and Utility Tools & Annotators. In addition to video recordings with real code walkthroughs, we provide sample notebooks for you to view and experiment with. At the end of the course, you will have the opportunity to get certified at no cost to you.
The course is also updated periodically to reflect the changes in our models.
We look forward to seeing you in class, from all of us at John Snow Labs.
Who this course is for:
- Data scientists who want to use natural language processing at scale
- Data scientists who want to build custom natural language understanding applications
- Data analysts who want to apply natural language processing
Instructors
We at John Snow Labs work to support natural language processing (NLP) tasks in the healthcare, legal, and finance industries. We provide state-of-the-art algorithms so that our clients can augment their work with the richness of text data. We host the NLP Summit every quarter, where we also offer training on how to use our products and library packages. We hope to provide a more accessible learning experience for everyone who is interested in NLP.
David Talby is the Chief Technology Officer at John Snow Labs, helping companies apply artificial intelligence to solve real-world problems in healthcare and life science. David is the creator of Spark NLP, the world’s most widely used natural language processing library in the enterprise. He has extensive experience building and running web-scale software platforms and teams: in startups, at Microsoft’s Bing in the US and Europe, and scaling Amazon’s financial systems in Seattle and the UK. David holds a Ph.D. in Computer Science and Master’s degrees in both Computer Science and Business Administration. He was named USA CTO of the Year by the Global 100 Awards and GameChangers Awards in 2022.
Jiri Dobes is the Head of Solutions at John Snow Labs. He has been leading the development of machine learning solutions in healthcare and other domains for the past five years. Jiri is a PMP-certified project manager.
His previous experience includes delivering large projects in the power generation sector and consulting for the Boston Consulting Group and large pharma. Jiri holds a Ph.D. in mathematical modeling.
Lead Data Scientist and ML Engineer with a decade of industry experience. I am currently pursuing my PhD in CS at Leiden University (NL) and hold an MS degree in Operations Research from Penn State University (USA).
I have worked as CTO, Head of AI, and Principal Data Scientist, among other roles, and I have provided hands-on consulting services in machine learning and AI, statistics, data science, and operations research to several start-ups and companies around the globe.
At Leiden University, I give lectures on Big Data Architecture, Distributed Data Processing, and Automated ML. I am also the instructor and course planner for "Intro to Python and Machine Learning Toolkit" and "NLP with Python: From Zero to Hero" at several online venues.
I also speak at data science and AI events, conferences, and workshops. So far, I have delivered more than a hundred talks at international and national conferences and meetups. Feel free to drop me a line if you want to invite me.