Visualizing Named Entity Recognition

A free video tutorial from Jose Portilla
Head of Data Science at Pierian Training
Rating: 4.6 out of 5 (instructor rating)
59 courses
3,231,996 students

Learn more from the full course

NLP - Natural Language Processing with Python

Learn to use Machine Learning, Spacy, NLTK, SciKit-Learn, Deep Learning, and more to conduct Natural Language Processing

11:21:51 of on-demand video • Updated September 2019

Learn to work with Text Files with Python
Learn how to work with PDF files in Python
Utilize Regular Expressions for pattern searching in text
Use Spacy for ultra fast tokenization
Learn about Stemming and Lemmatization
Understand Vocabulary Matching with Spacy
Use Part of Speech Tagging to automatically process raw text files
Understand Named Entity Recognition
Visualize POS and NER with Spacy
Use SciKit-Learn for Text Classification
Use Latent Dirichlet Allocation for Topic Modelling
Learn about Non-negative Matrix Factorization
Use the Word2Vec algorithm
Use NLTK for Sentiment Analysis
Use Deep Learning to build out your own chat bot
(calm music) Welcome back to this lecture on visualizing named-entity recognition. Let's review how to visualize NER with spaCy and displaCy. Let's head over to a Jupyter Notebook.

Okay, here I am at a Jupyter Notebook, and I've imported spaCy, loaded up the English library, and said from spacy import displacy. Next, we're going to create a document that we actually want to visualize. We'll call nlp, and here we're gonna say: over the last quarter, Apple sold nearly 20,000 units, or let's say 20,000 iPods, for a profit of 6 million. After we define that document, we'll call displacy.render, pass in the doc, set style to 'ent' for entity, and since I'm using Jupyter, set jupyter equal to True.

And here you can see spaCy, to the best of its ability, is going to highlight and color the entities it finds. So it found a date entity, an organization, money, iPods as a product, a cardinal for some sort of amount (nearly 20,000), and so on. So you can see already the power that displaCy has to highlight entities.

Now, let's imagine you have multiple lines and you wanna view them line by line. So for example, we're gonna create one more line inside of this document. In fact, we're just gonna pass in another string. So inside these parentheses, go ahead and create a new string. And we'll say something like: by contrast, Sony only sold, we'll say, 8,000 music players, and let's say Walkman music players. (light mouse clicking) Okay?

So if you call render now on the document, it'll show everything as one giant, long string, but maybe you only wanna do this line by line. The way you do that is to split the document with sentence segmentation by saying for sent in doc.sents, and then calling displacy.render individually on each of those by passing in nlp(sent.text), the text of each individual sentence. Again, make sure you set style. Whoops. Style is equal to 'ent'.
And since I'm running this in Jupyter, I will set jupyter equal to True. And here you can see it gives you a little more space in between the lines. It's not too different from rendering everything at once, but you can see that each new line starts at the left margin, instead of "By contrast" being essentially a continuation of the previous line.

Okay, now you can also have different options; remember there's an options dictionary. You can choose options for things like coloring or customized effects. More importantly, you can choose options to display or highlight only certain entities. So maybe you're only really interested in product entities. What you can do is create a dictionary called options, and under the ents key, pass in a list of the entity labels you're interested in. So maybe we're only interested in product entities. Then we're going to render this; we'll render the whole thing. Let me copy this and paste it down here. And then we'll say options is equal to options. Run that, and it will only highlight the entities that you listed here. And if you want to add another one, like organization, you simply add it to this list and then it's gonna highlight that too. So that way, if you're not interested in date entities and don't want those highlighted, you can easily filter out whatever entities you want.

Then as always, there's things like different colors. So you can actually choose different colors for different entities. There are the default colors, but if you really wanna choose your own, all you need to do is create another dictionary called colors. Then you choose an entity and pass in the color that you want. So for example, you can say that you want this entity to be red. Once you've done that, all you need to do inside your options is add a colors key and set its value equal to this colors dictionary.
So if you rerun this, you'll notice now that organizations are red, and as always you can pass in your own custom hex codes in case you want a color that isn't a simple named one. So for example, I can pass in this hex code color and it'll create something that looks like that.

You can also customize these even more. You can actually add linear gradients or radial gradients. So let me show you examples of that. I'm just going to copy and paste these from the notes we provide. So there's a radial gradient. All you need to do is choose the tag and then say radial-gradient and choose two colors, either hex codes or color names, and if you run that, it's gonna do a radial gradient. The inside is gonna start yellow and then the outside is gonna go to green. So for example, we could say from yellow inside to red outside, and you get something that looks like that.

You can then also do a linear gradient. So you just use linear-gradient and choose your colors. Let me copy and paste that. So here we're gonna choose linear-gradient, and you can choose an actual degree and then a start and stop color. So you can run that and it's going to start at this darker purple and go to this pink. Again, you can just pass in color names here if you want. You can say go from maybe orange to red. Run that. You can see it going from orange to red. And you can change this to 180 degrees if you want it to go from top to bottom, and so on. Or you can even choose, if you really want, 45 degrees and have it go from one corner to the other. It's really up to you. Usually you won't need to customize color effects that often unless you have a very specific style that you're looking for.

Finally, we've shown you how to render everything inside of the Jupyter Notebook, but maybe you're running a .py script.
As always, you can simply call displacy.serve, pass in the document, pass in style equal to 'ent', and if you want, you can still pass in options. And notice that now I'm not saying jupyter is equal to True; instead I'm just saying serve this. When you run that, you should see it being served. So you just need to go to 127.0.0.1 at whatever port it's telling you. In my case, it's port 5000. You copy and paste that address and then you'll see it being rendered at that location.

All right, so that's really it for rendering. The most important thing to note here, I believe, is not really the coloring but the fact that you get to choose and filter based on which entities you want. So you can always use this ents key and choose the specific entities you wanna highlight. Thanks and we'll see you at the next lecture.