Power BI Demo: Working with Text Tools

A free video tutorial from Maven Analytics
Award-Winning Analytics & Business Intelligence Training
Rating: 4.6 out of 5Instructor rating
30 courses
812,222 students
Power BI Demo: Working with Text Tools

Lecture description

In this lecture we'll cover Power BI Desktop tools designed specifically to work with text, including split column, format column, or trim and clean.

Learn more from the full course

Microsoft Power BI Desktop for Business Intelligence

Master Power BI Desktop & learn advanced Power BI analysis & data visualization w/ a top Microsoft Power BI instructor

10:42:59 of on-demand video • Updated August 2022

Build professional-quality business intelligence reports from the ground up
Blend and transform raw data into beautiful interactive dashboards
Design and implement the same B.I. tools used by professional analysts and data scientists
Showcase your skills with two full-scale course projects (with step-by-step solutions)
Understand the business intelligence workflow from end-to-end
Learn from a best-selling instructor and professional BI developer
English [Auto]
Next up, we've got text specific tools and looking at the Transform tab in the Query Editor. You'll notice that Power BI groups different sets of tools together based on their purpose or function. So in this case, we'll find all of our text specific tools grouped together at the end of the ribbon. And within this group, we have some really interesting, powerful options. For one, we can split up a column based on a specific character or delimiter or based on a number of characters. We can format any of our text columns using basic formatting options like lowercase, uppercase, proper case, which is capitalizing the first letter of each word. Or we can use tools like trim, which eliminates leading and trailing spaces, or clean, which does the same thing and also eliminates non printable characters. Now you might be thinking that those trim and clean options really aren't that helpful until you run into the case where you have one trailing space in your data set. And trust me, this will drive you crazy the first time you experience it until you figure it out. And the problem is, as human beings, we are incapable of seeing a trailing space. It's completely invisible to us and it looks exactly the same as another data point without that trailing space. But to excel, or in this case to power by it looks like a completely different and unique value. So in cases like that, trim or clean can be great tools just to standardize and help avoid issues like that, especially if you're working with really messy or unstructured text data. We also have some great extract tools. We can extract a certain subset of characters from a string based on a specified length. We can extract the first or last number of characters or a range, but where it gets really interesting is using data limiters so we can tell power. By that we want to return all of the characters before a specific delimiter or symbol or character or after it, or between two distinct. MM. And there are also some advanced options as well that allow you to specify whether you search from the left side of the string or the right. And if you want to skip a certain number of instances of the delimiter before returning text. So some really great flexibility there with those extract tools. So you may have noticed that some of the tools here are grayed out and inactive like merge columns or paths. And that brings up a really important point, which is that this toolbar is completely dynamic based on what you've selected. So if you've only selected a single column, the merge columns option is irrelevant, so you can't even click it. You'd have to select multiple columns in order to activate that option and then taking that even further. If you've selected a column that's numerical instead of text, this entire group of tools might be replaced by a number of specific tools instead. So just remember that the entire ribbon, the entire toolbar that you're seeing here will dynamically change based on your selections. Now, one more very important point before we shift gears into power. By any time you see this yellow box that says, Hey, this is important, you guessed it, I'm about to talk about something important. So that means focus in, pay attention, stop playing with the dog or checking Facebook because this is something that is going to come up time and time again in the course. And it's really, really important for you to fully grasp and understand. So what I want to talk about now is the difference between transform and add column, because this is something that confused me for longer than I care to admit. And what I was noticing was that the same tools, in fact the same identical sets of tools, kept popping up in different places. And for the longest time I just thought this was really confusing and redundant until I realized that the outcome is completely different depending on where you select that tool. So when you select a tool from within the Transform tab, you're essentially modifying or overwriting the column that you've selected. But when you choose a tool from the Add column tab, you're creating a brand new column to your table. So that may sound really obvious when I say it now, but I guarantee as you're learning this tool and as you're playing around with the query editor, you will at some point in time select the tool from the wrong tab. And that's okay, because obviously nothing's set in stone. It's as simple as just deleting that last applied step and you're back where you started, but it is something to keep in mind and hopefully that will help you at least understand where you're going wrong and why these tools appear in multiple places. So let's open up Power BI and get our hands dirty with some of these text tools. All right, so here I am in my Adventure Works report. Go ahead and open it up as well. To head back into the get data option. Click text csv again. This time instead of products. Let's go into adventure works customers and double click. Here. We've got our preview. All of these default settings look good so I can go ahead and click edit to launch the query editor. All right. So first things first. Here's our product lookup query that we created in the last lecture and now we see a second for adventure works customer. So let's go through those two steps that we do every time we get in here to the query editor, we're going to start with the table name. And for the sake of consistency, let's go with a W underscore whoops customer lookup. The last file that we loaded was called product lookup and it was called product lookup for a reason because it had all sorts of information about individual products where each row represented a unique product key and each column represented some attribute about that product. Now similarly here we're looking at customer data instead, so I can press enter LOC in my table name. And as I look in the preview here, as you can see, I've got a customer key and then all sorts of attributes about my customers. So let's go ahead and finish our quick queue. We've got column headers that look correct. Let's look at the data values very quickly are key is a whole number which is good. I've got names which are strings or text fields which are good. I've got a birth date column, which is a date represented by this little calendar icon in the top left. And then as we scroll through. Marital status. Gender. Email. All. Text. This one's interesting and actually recognized the annual income field as a currency or fixed decimal. That's fine. We can leave it as is. And then we've got total children. Education level, occupation and homeowner. Yes. No flag, all text. So check those boxes. Everything looks good. And now we can proceed to our table transformations. So first things first, let's scroll over. I've got my eye on these these name columns because they're all uppercase and that looks silly to me. So I'd like to make those proper case instead. So let's go ahead and select the prefix column, go into add column, we'll find our text tools here. And what we want to do is go into the format tools and click capitalize each word which is proper case. And when we do that, oh, it created a new column here at the end of our table, which is essentially a duplicate version of our prefix column, but with the correct capitalization. And you can see that the header name has been defaulted to capitalize each word, which is just totally ridiculous. And as you probably have guessed by now, the reason this happened was because we accessed that format tool through the add column tab when what we really wanted to do was simply overwrite the existing column and transform it. So we should have accessed the formatting tools through the Transform tab instead. And so that's what I'm talking about. It's very easy to make these kind of mistakes, so you really just have to be vigilant about knowing which tab you're in each time you create a calculated column. But no worries. The good news is that it's a very easy fix. You can't control Z to undo here. The equivalent is just going into your applied steps and pressing the X to remove that last step that we just created. So now I'll scroll back. Select the prefix column again, make sure that I'm in my transform tab and go through that same process format. Capitalize each word. And there you go. It's overwritten the values into the format that I'd prefer. Now, to do the same thing to the first and last name columns, I could do them individually, or I can select one. Hold the shift key. Click the other. That will grab both of them at the same time. And let's do the two for one special. Capitalize each word. So now I have a nice clean prefix, first name and last name. So that's helpful. But I'd really like to have the customer's full name accessible in a single field as well. So what that means is that I'd like to merge all three of these columns together and separate them with a space to create a brand new column that I'll name, full name. So what I've done is select prefix. I've held shift selected last name to grab all three and instead of clicking merge warning signs, we're still in the Transform tab. I need to make sure to go into add column instead and access the merge tool from here. So when I click Merge columns, it gives me a little dialog box with some options. In this case, I do want a separator in between the values. I want that to be a space. And let's name our new column full name. So a press OC. And you can see it's added a new step when I scroll to the right. There we go. We've got our new full name column. Looks good. We've got spaces between the names and now. One thing to note, it's kind of a side note. It added this step, but it kind of gave it a default name that says inserted merged column. Now, as I click through, it's pretty easy to see exactly what column was merged and what was added. But if we want to make this even more clear, I can always right click and rename that step. And instead of just saying inserted merged column, I can actually type full name. That just makes it a little bit more explicit what's going on. I inserted a new column and it's called Full Name. All right. So next up, I want to make a couple more changes. I want to really focus on this email address column here. And the first thing that I want to do is extract the username from this email. But the problem is I can't just say give me the first five characters, because that would work for John 24, but not Eugene ten. So it's got to be a dynamic number of characters from the left side of this email address based on the location of that add symbol. So this is where we're going to use that extract based on delimiter option. So let's go ahead and select email address. I want to add a new column for username and I'm going to use the extract tools here. So again, not just a default length, not an explicit number of characters, but I want the text before a delimiter. So here you go. All I need to do is say the delimiter that you're looking for. Is that aside, if we drill into the advanced options. This is where you can say search from the left or search from the right. You can skip a certain number of instances of the delimiter. In this case, we don't need any of that. All we want is the text before the app sign. So Press OC added a new column. As you can see, it didn't let me name the column at the same time, but it did give me the values that I want. So what I can do here is just double click the column header and I can give it a name. Let's call it username. Now let's do one more similar example, but with one more complication. So going back to our email address, what if this time, instead of the username, we want to return the domain name and the domain is anything that falls between the at symbol and the dot com at the end of the string. To do that I can use the same extract tools, but this time I want the text between two de limiters so I can go into this option here and I can say my start delimiter is the at sign and my end delimiter is either the period or it can type dot com. Both will do the same thing and when I press OC same exact idea, it's given me a new column with the string or the characters between those two. Mm. I can change the column title to something like domain and Press enter and there you have it. So very, very easy way to do some pretty complex stuff. So one last adjustment to make to this domain column here. I want to get rid of the dash and replace it with a space. So to do that, I'm going to transform the existing column, not add a new one. And you'll see this other grouping of tools that say any column that means they're not text or number or date specific. They can be applied in a number of different circumstances. In this case, I'll find the replace tool right here in this set. So it's as simple as it sounds. I can find the dash and I can replace it with a space press. Okay, now I've got adventure. Space works, and why don't we go ahead and format that? Capitalized in proper case and we are all set. So there you go. We use the formatting options. We added columns, we transformed columns. We use the extract tools based on mm. We merged columns. Some really good examples of how these different text functions can be used to enhance and clean and transform a data table. So I think we're all good with customer lookup. The only thing left is to go back to our home tab and click close and apply to load that data into our file. And as soon as that loads up, we'll see that it's going to appear as another object right here in our relationships tab, right next to our friend product lookup. We'll also see it available in our field list, accessible in both the data view and the report view. So there you have it. Query editing. Text Tools.