Power BI Demo: Working with Text Tools

Maven Analytics
A free video tutorial from Maven Analytics
Award-Winning Analytics & Business Intelligence Training
4.6 instructor rating • 23 courses • 565,685 students

Lecture description

In this lecture we'll cover Power BI Desktop tools designed specifically to work with text, including split column, format column, or trim and clean.

Learn more from the full course

Microsoft Power BI Desktop for Business Intelligence

Master Power BI Desktop & learn advanced Power BI analysis & data visualization w/ a top Microsoft Power BI instructor

10:43:16 of on-demand video • Updated September 2021

  • Build professional-quality business intelligence reports from the ground up
  • Blend and transform raw data into beautiful interactive dashboards
  • Design and implement the same B.I. tools used by professional analysts and data scientists
  • Showcase your skills with two full-scale course projects (with step-by-step solutions)
  • Understand the business intelligence workflow from end-to-end
  • Learn from a best-selling instructor and professional BI developer
English [Auto] Next up, we've got text specific tools and looking at the Transform tab and the query, Ed, you'll notice that powered by groups different sets of tools together based on their purpose or function. So in this case, we'll find all of our text specific tools grouped together at the end of the ribbon. And within this group, we have some really interesting, powerful options. For one, we can split up a column based on a specific character or delimiter or based on a number of characters. We can format any of our text columns using basic formatting options like lowercase, uppercase, proper case, which is capitalizing the first letter of each word. Or we can use tools like Trem, which eliminates leading and trailing spaces or clean, which does the same thing and also eliminates non principal characters. Now you might be thinking that those trim and clean options really aren't that helpful until you run into the case where you have one trailing space in your data set. And trust me, this will drive you crazy the first time you experience it until you figure it out. And the problem is, as human beings, we are incapable of seeing a trailing space. It's completely invisible to us and it looks exactly the same as another data point without that trailing space. But to excel or in this case to power by it looks like a completely different and unique value. So in cases like that, trim or clean can be great tools just to standardize and help avoid issues like that, especially if you're working with really messy or unstructured text data. We also have some great extract tools. We can extract a certain subset of characters from a string based on a specified length. We can extract the first or last number of characters or arrange. But where it gets really interesting is using delimiters so we can tell power by that. We want to return all of the characters before a specific delimiter or symbol or character or after it, or between two distinct delimiters. And there are also some advanced options as well that allow you to specify whether you search from the left side of the string or the right, and if you want to skip a certain number of instances of the delimiter before returning text. So some really great flexibility there with those extract tools. So you may have noticed that some of the tools here are great out inactive like merge columns or pass. And that brings up a really important point, which is that this toolbar is completely dynamic based on what you've selected. So if you've only selected a single column, the merge columns option is irrelevant. So you can't even click it. You'd have to select multiple columns in order to activate that option. And then taking that even further, if you've selected a column that's numerical instead of text, this entire group of tools might be replaced by a number of specific tools instead. So just remember that the entire Reben entire toolbar that you're seeing here will dynamically change based on your selections. Now, one more very important point before we shift gears into power. By any time you see this yellow box that says, hey, this is important, you guessed it, I'm about to talk about something important. So that means focus in, pay attention, stop playing with the dog or checking Facebook, because this is something that is going to come up time and time again in the course. And it's really, really important for you to fully grasp and understand. So when I want to talk about now is the difference between transform and add column, because this is something that confused me for longer than I care to admit. And what I was noticing was that the same tools, in fact, the same identical sets of tools kept popping up in different places. And for the longest time, I just thought this was really confusing and redundant until I realized that the outcome is completely different, depending on where you select that tool. So when you select a tool from within the transform tab, you're essentially modifying or overwriting the column that you've selected. But when you choose a tool from the ADD column tab, you're creating a brand new column to your table. So that may sound really obvious when I say it now, but I guarantee as you're learning this tool and as you're playing around with the query editor, you will at some point in time select the tool from the wrong tab. And that's OK, because obviously nothing set in stone. It's as simple as just deleting that last applied step and you're back where you started. But it is something to keep in mind and hopefully that will help you at least understand where you're going wrong and why these tools appear in multiple places. So let's open up Parubiy and get our hands dirty with some of these text tools. All right, so here I am in my Adventure Works report, go ahead and open it up as well. I'm going to head back into the get data option, click text's ESV again. This time instead of products, let's go into adventure works customers and double click. We've got our preview, all of these default settings look good, so I can go ahead and click edit to launch the query Ed. All right, so first things first, here's our A.W. product look up query that we created in the last lecture and now we see a second for Adventure Works customer. So let's go through those two steps that we do every time we get in here to the query. Ed, we're going to start with the table name. And for the sake of consistency, let's go with A.W. Underscore customer. Look at the last file that we loaded was called Product Look and it was called product look up for a reason because it had all sorts of information about individual products where each row represented a unique product key and each column represented some attribute about that product. Now, similarly, you were looking at customer data instead. So you can press enter like in my table name. And as they look in the preview here, as you can see, I've got a customer key and then all sorts of attributes about my customers. So let's go ahead and finish our quick Q&A. We've got column headers that look correct. Let's look at the data values very quickly. Our key is a whole number, which is good. Got names which are string's or textfield, which are good, got a birth date column, which is a date represented by this little calendar icon in the top left. And then as we scroll through. Marital status, gender, email, all text, this one's interesting and actually recognized the annual income field as a currency or fixed decimal. That's fine. You can leave it as is. And then we've got total children, education level, occupation and homeowner. Yes, no flag, all text. So check those boxes. Everything looks good. And now we can proceed to our table transformations. So first things first. Scroll over. I've got my eye on these these name columns because they're all uppercase and that looks silly to me. So I'd like to make those proper case instead. So let's go ahead and select the prefix column, go into add column finer text tools here and we want to do is going to the format tools and click capitalize each word, which is proper case when we do that. Oh it created a new column here at the end of our table, which is essentially a duplicate version of our prefix column. But with the correct capitalization you can see that the header name has been defaulted to capitalize each word, which is just totally ridiculous. And as you probably have guessed by now, the reason this happened was because we accessed that format all through the ADD column tab when what we really wanted to do was simply overwrite the existing column and transform it. So we should have access to the formatting tools through the transform tab instead. And so that's what I'm talking about. It's very easy to make these kind of mistakes. So you really just have to be vigilant about knowing which tab you're in each time you create a calculated column. But no worries. The good news is that it's a very easy fix. You can't control Z to undo here. The equivalent is just going into your applied steps and pressing the X to remove that last step that we just created. So now scroll back. Select the prefix column again, make sure that I'm in my transform tab and go through that same process format, capitalize each word and there you go, it's overwritten the values into the format that I'd prefer now to do the same thing to the first and last name columns. I could do them individually or I can select one, hold the shift key, click the other. That will grab both in the same time and let's do the two for one special capitalize each word. So now I have a nice clean prefix first name and last name, so that's helpful. But I'd really like to have the customer's full name accessible in a single field as well. So what that means is that I'd like to merge all three of these columns together and separate them with the space to create a brand new column that I'll name full name. So what I've done is select prefix. I've held shift selected last name to grab all three and instead of clicking merge, warning signs were still in the transform tab. I need to make sure to go into add column instead and access the merge tool from here. So when I click merge columns it gives me a little dialog box with some options. In this case, I do want a separator in between the values. I want that to be a space. And let's name our new column. Full name. So, yes, OK. And you can see it's added a new step, I scroll to the right. There we go. We've got our new full name column. Looks good. We've got spaces between the names. And now one thing to note, just kind of a side note, it added this step, but it kind of gave it a default name that says inserted merged column. Now, as I click through, it's pretty easy to see exactly what column was merged and what was added. But if we want to make this even more clear, I can always. Right. Click and rename that step. And instead of just saying inserted merged column, I can actually type full name. That just makes it a little bit more explicit. What's going on. I inserted a new column and it's called Full Name. All right. So next up, I want to make a couple more changes. I want to really focus on this email address column here. And the first thing that I want to do is extract the user name from this email. But the problem is, I can't just say give me the first five characters because that would work for John. Twenty four, but not Yujin 10. So it's got to be a dynamic number of characters from the left side of this email address based on the location of that ad symbol. So this is where we're going to use that extract based on delimiter option. So let's go ahead and select email address. I want to add a new column for username. I'm going to use the extract tools here. So, again, not just the default length, not an explicit number of characters, but I want the text before a delimiter. So here you go, all I need to do is say the delimiter that you're looking for is that assane we drill into the advanced options. This is where you can say, you know, search from the left or search from the right. You can skip a certain number of instances of the delimiter. This case, we don't need any of that. All we want is the text before the assign. So, OK, adding a new column, as you can see, it didn't let me name the column at the same time, but it did give me the values that I want. So what I can do here is just double click the column header and I can give it a name. Let's call it user name. Now let's do one more similar example, but with one more complication. So going back to our email address, what if this time, instead of the username we want to return the domain name and the domain is anything that falls between the at symbol and the dot com at the end of the string. To do that, I can use the same extract tools, but this time I want the text between two delimiters so I can go into this option here and I can say my start delimiter is the at sign and my end delimiter is either the period or I can type dot com. Both will do the same thing. And when I press. OK, same exact idea. It's given me a new column with the string or the characters between those two delimiters. I can change the column title to something like Domain and Press Enter and there you have it. So very, very easy way to do some pretty complex stuff. So one last adjustment to make to this domain column here. I want to get rid of the dash and replace it with a space. So to do that, I'm going to transform the existing column, not add a new one. And you'll see this other grouping of tools that say any column that means they're not text or number or date specific. They can be applied in a number of different circumstances. In this case, I'll find the replace tool right here in the set. So it's as simple as it sounds. I can find the dash and I can replace it with a space press. OK, now I've got adventure space works and why don't we go ahead and format that? Capitalized and proper case and we are all set. So there you go, we use the formatting options, we added columns, we transformed columns, we use the extract tools based on delimiters. We merged columns, some really good examples of how these different text functions can be used to enhance and clean and transform a data table. So I think we're all good with customer lookup. The only thing left is to go back to our hometown and click close and apply to load that data into our file. And as soon as that loads up, we'll see that it's going to appear as another object right here in our relationships tab right next to our friend product, look up. We'll also see it available in our field list, accessible in both the date of view and the report view. So there you have it, query editing, text tools.