String Representation in Assembly Language

xor pd
A free video tutorial from xor pd
Low level technology training
4.0 instructor rating • 2 courses • 23,744 students

Lecture description

In this video we explain how to represent text strings inside our own assembly programs. We also introduce two schools of string representation: length-prefixed strings and null-terminated strings.

Learn more from the full course

Assembly Language Adventures: Complete Course

Learn the language of your computer

29:09:32 of on-demand video • Updated November 2019

  • Learn to code on the x86 Architecture using Assembly Language
  • Gain a solid understanding of low-level concepts
  • Understand how your computer works
  • Become a tough person
English [Auto] Hi, welcome to another lesson. Today's topic is string representation. The objective of this lesson: we will learn how to represent text strings inside our own assembly code.
How do we express a text like "hello" using ASCII? As you've probably guessed, we just have to go to the ASCII table and replace every symbol with the correct number. We're going to get something like this. By the way, all the numbers that you see are in base 16, or hexadecimal. We call this kind of representation a text string, or just a string.
But how can we know the size of the string? This is an important question, because we want to know how much space to allocate for the string. Or maybe, if we want to copy the string from one place to another, we want to know how much space we need in the destination. So at any time we should know how long the string is, and we need some kind of way to indicate the size of the string. There are basically two schools of indicating the size of a string when it is stored in memory. In fact there are more, but basically the main ones are these two.
The first one is called the length prefix, or the Pascal style. In this method we write the size of the string in the first byte. So "hello" is going to be encoded as follows. The first byte is 05, which represents the number of bytes in the string, because we have five letters: one, two, three, four, five. So the first byte is going to be the length, and the next five bytes are going to be the actual string. By reading the first byte we know the size of the string.
The second style is the C style, or the null-terminated string. In this style the string ends, or terminates, with the null character, which is zero. You probably remember it from the ASCII table: it is the first character. "hello" is represented now as those five numbers, which represent h, e, l, l, o, and then the null character, or the zero terminator. It has many names.
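As a rough illustration of the two styles described above, here is a minimal sketch in C rather than in assembly, with "hello" written out as hexadecimal ASCII bytes. The array and function names are hypothetical and are not taken from the lecture.

```c
#include <stddef.h>
#include <stdint.h>

/* "hello" in ASCII: h=0x68, e=0x65, l=0x6C, l=0x6C, o=0x6F. */

/* Length prefix (Pascal style): the first byte holds the length, the text follows. */
static const uint8_t hello_prefixed[] = { 0x05, 0x68, 0x65, 0x6C, 0x6C, 0x6F };

/* Null-terminated (C style): the text comes first, then a single zero byte. */
static const uint8_t hello_terminated[] = { 0x68, 0x65, 0x6C, 0x6C, 0x6F, 0x00 };

/* Hypothetical helper: the length of a length-prefixed string is its first byte. */
size_t prefixed_length(const uint8_t *s) {
    return s[0];
}

/* Hypothetical helper: the length of a null-terminated string is found by
   walking forward until the zero byte is reached. */
size_t terminated_length(const uint8_t *s) {
    size_t n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}
```

Both arrays occupy six bytes. The only difference is where the extra byte sits and what it means.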
These are basically the two ways to represent strings. There are other representation systems which are more complex, and there are even more. In the future you will get to see some of those other representations, but here we are only going to speak about these two.
Some pros and cons of each method. Length prefix: one pro is that the length can be calculated quickly. We just have to find out what is inside the first byte, because the first byte holds the length. One con of the length prefix is that the size of the string is limited. For example, if the length prefix is of size one byte, then the maximum size of the string is going to be 255 characters, which is not that long. We can extend this limitation: if we chose the length prefix to be four bytes, for example, then we could have very long strings. We could even choose a prefix of size 64 bits, or 8 bytes, which is really a lot. We will probably never have a string of such size.
Let's take a look at null termination. An important pro of this method is that the size of the string is virtually unlimited. As long as we didn't see the null terminator, we just keep reading the string, so it can be of any size. But there are also important cons. It takes a longer time to calculate the length. For example, if I want to find out the length of a null-terminated string, I have to go over all the characters of the string until I find the zero terminator, which means I have to read the whole string just to find out its size. And there are some security issues: strings that are represented using the null termination method can have no ending. Basically, if you forget to put the zero terminator, or maybe something erases the zero terminator, the string just keeps going until the next zero byte, wherever it shows up. And this could cause some kind of security issues.
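The trade-offs above can be made concrete with a small sketch in C. It assumes a length prefix of 1 to 8 bytes and shows both the size limit that a prefix imposes and a scan that refuses to run past a given limit when the terminator is missing. Both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest string length that a length prefix of `prefix_bytes` bytes can record.
   One byte gives 255; eight bytes give the full 64-bit range. */
uint64_t max_prefixed_length(unsigned prefix_bytes) {
    if (prefix_bytes >= 8) {
        return UINT64_MAX;
    }
    return ((uint64_t)1 << (8 * prefix_bytes)) - 1;
}

/* Scans for the zero terminator but never reads more than `limit` bytes.
   Returns the number of bytes before the terminator, or `limit` if no
   terminator was found within the limit. */
size_t bounded_length(const uint8_t *s, size_t limit) {
    size_t n = 0;
    while (n < limit && s[n] != 0) {
        n++;
    }
    return n;
}
```

The bounded scan is one common way to limit the damage of a missing terminator. It still has to visit every byte of the string, which is the cost the lecture describes.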
Well, I won't get into this now, but you should have it in the back of your mind. In this course we will mostly use null-terminated strings, and this is because of the following reasons. Windows API functions expect null-terminated strings, which means that it will be easier if we also work with null-terminated strings. Another reason is that there are special assembly instructions to deal with null-terminated strings. I want to show you those instructions, and to work with them we have to work with null-terminated strings. But don't feel obligated. If you want to do things in a different way, this is something that you can do. You can do whatever you want. It is just that in this course we are going to work with null-terminated strings.
String declarations. Let's take a look at some declarations inside the data section here. The first one is a very simple one. We just use db, the define bytes operator, to define the string "Hello". Now, you might be surprised that I can do something like this, because so far we have used db only with numbers, with bytes. The assembler understands that this is a string, and it knows to convert it to separate bytes. It will convert each character: for example, the H character will be converted to the number 48 (hexadecimal), the e character will be converted to 65, and so on.
There are some other ways in which we can declare the same string. Let's look at the other examples. String number two is declared mostly the same as the first one, but we use a different kind of quotes: one uses double quotes and the other uses single quotes. It doesn't matter. You can choose whatever you like; you get exactly the same thing. And in the next one we just use numbers. You can do this if you want. Usually it is not very comfortable. In my opinion the quoted form is much neater, but maybe in some cases you might want to use numbers, so you can use numbers. Take a look at the end of all those strings that you have seen so far.
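To illustrate that quoted text and a list of numbers produce the same bytes, here is a short sketch in C standing in for the assembler declarations. One difference is worth flagging: a C string literal gets its zero terminator added by the compiler, whereas in the assembler declarations discussed above the programmer writes the terminator explicitly. All names below are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Three spellings of the same six bytes: H e l l o followed by zero. */
static const char hello_literal[] = "Hello";   /* the C compiler appends the 0 */
static const char hello_chars[]   = { 'H', 'e', 'l', 'l', 'o', 0 };
static const char hello_numbers[] = { 0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x00 };

/* Returns 1 when both buffers have the same size and the same contents. */
int same_bytes(const char *a, size_t a_size, const char *b, size_t b_size) {
    return a_size == b_size && memcmp(a, b, a_size) == 0;
}
```

Comparing any two of the three arrays with `same_bytes` and their `sizeof` values reports that they are identical.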
They all contain the zero terminator at the end. It is your responsibility as an assembly programmer to put the terminator at the end. If you don't, the string will just continue. For example, if I didn't put this zero terminator here, the first string would run straight into the next one: there would be no separation between those two strings. Now let's look at what else can be done.
Flexibility. I can write, for example, only part of the string and then continue on the next line. We're not going to have a newline character because I did this; it is exactly the same string. All those forms are turned into exactly the same bytes. If I want, I can break the string in the middle; I can break it into parts. I just have to make sure that I finish with the zero terminator at the end. Note that I didn't put any terminator here, because this is the middle of the string. And here is another example, which is a bit strange. I write "Hello", then I write the number 20, which means space, then I write "World", and I finish with the zero terminator. 20 just means space.
The newline character. Let's assume that you have this text, which is a bit more complex because it is made of more than one line. [reads the three lines of the text] We can see that we have two newline sequences. We have one here, which takes us to the new line, and we have another one here. Usually we don't refer to the end of the text as a newline; we just put the zero terminator here at the end of the text.
Different representations. In different operating systems a newline is represented in different ways. In Windows, a newline is marked by the sequence 0D and then 0A, or, in base ten if you want, 13 and then 10. In Linux, a newline is marked by 0A alone. If you use a different operating system, which is not Windows or Linux, check how the newline is represented in your system.
Some historical notes. Why would we need two symbols to represent just one idea, a new line?
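The points above about the forgotten terminator and about writing a string in pieces can be sketched in C as follows. The arrays imitate consecutive byte declarations in a data section; the names and the exact words used are assumptions made for illustration.

```c
#include <string.h>

/* Two strings laid out back to back. The first one is missing its zero
   terminator, so anything that reads it continues into the second one. */
static const char adjacent[] = {
    'H', 'e', 'l', 'l', 'o',        /* first string, terminator forgotten */
    'W', 'o', 'r', 'l', 'd', 0      /* second string, properly terminated */
};

/* One string written in pieces: text, the byte 0x20 (space), more text,
   and a single zero terminator at the very end. */
static const char pieces[] = {
    'H', 'e', 'l', 'l', 'o',
    0x20,
    'W', 'o', 'r', 'l', 'd',
    0
};
```

Reading `adjacent` as a null-terminated string yields ten characters, not five, while `pieces` reads as a single eleven-character string with a space in the middle.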
Historically, those symbols, 0D and 0A, like many symbols in the ASCII code, were used to represent directions for a printer. The ASCII code was initially used for communication, and it had directions for the printer about how to print the data. So 0D means CR, or carriage return: return to the beginning of the current line. It is a command for the printer. And 0A means LF, or line feed, which means advance the paper one line forward.
So basically, assume that this is the printer, and the carriage is at the end of a line on the page. This is the page in the printer. When the printer reads those symbols, it follows the directions that it has. So assume, for example, that it sees the sequence 0D, 0A. It is going to first take the carriage, which is at the end of the line, to the beginning of the line. And then it is going to advance the paper one line forward, because of the line feed. So basically we get from here, which is the end of the previous line, to the beginning of the new line. So historically this sequence meant: go to the beginning of the line, and then go to the next line, which is just a new line. This is how it was when those were physical directions. Now we have software, and we don't have to move a carriage to the beginning of the line. This is just a historical idea that was somehow preserved in Windows. We still use those codes to represent a newline. It is a bit strange, but this is how it is.
Declaring strings with newlines. As you have probably guessed, we just have to add those two numbers, 0D and 0A. For example, if we want to declare this text, we just have to add the newline codes in the correct places. So we have the first line, and then 13 and 10, which is one way to write it: this is the base 10 representation. Then we have 0D, 0A, which is the base 16 representation, right after the second line. And finally we have the last line, and we just end with the null terminator. We don't have to use a newline again, because we just want to end the text. Here is another example, which shows that we don't have to split the lines in the source.
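A multi-line text with Windows-style newlines can be sketched in C like this. The line contents are placeholders and not the text shown in the lecture, and the function name is hypothetical.

```c
#include <stddef.h>

/* A three-line text using the Windows newline sequence 0x0D, 0x0A (CR, LF).
   There is no newline after the last line, only the terminator that the
   C compiler appends to the literal. */
static const char three_lines[] =
    "first line"  "\x0D\x0A"    /* newline written as hexadecimal byte values */
    "second line" "\r\n"        /* the same two bytes: \r is 0x0D, \n is 0x0A */
    "third line";

/* Counts CR LF pairs in a null-terminated text. Reading s[i + 1] is safe
   because s[i] is known to be non-zero, so the terminator is at i + 1 or later. */
size_t count_crlf(const char *s) {
    size_t count = 0;
    for (size_t i = 0; s[i] != 0; i++) {
        if (s[i] == 0x0D && s[i + 1] == 0x0A) {
            count++;
        }
    }
    return count;
}
```

Three lines are separated by two newline sequences, which matches the count the lecture points out.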
So, for example, we can just put the first line, and then 0D, 0A, which means newline, and then the second line, all on one source line. We don't have to separate the lines in our own source code.
Let's summarize what we've seen in this lesson. There are two basic ways to indicate the size of a string: length prefix, which is the Pascal style, and null termination, which is the C style. We are mostly going to work with null termination in this course. Strings are declared using the db syntax, just like we have declared separate bytes. And a newline is represented as the sequence 0D, 0A in the Windows operating system, and as just 0A in Linux systems.
Exercise. You will get an assembly source file in the exercise. Assemble this file, and then open the output, which will be an executable file, using a hex editor. Inside the hex editor, try to identify the strings. Just take the source file and compare it against what you see in the hex editor, and try to identify the strings. This is a pretty basic exercise, just to make sure that you understand what the declaration of a string turns into in the final binary. Have fun, and see you soon in the next lesson.
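For readers who want a feel for what to look for in the hex editor during the exercise, the following C sketch formats a byte buffer the way a hex view typically shows it. It is an illustration only, not part of the course material, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes each byte of `data` as two uppercase hex digits, separated by single
   spaces, into `out`. Returns 0 on success, or -1 if `out` is too small. */
int hex_dump(const unsigned char *data, size_t size, char *out, size_t out_size) {
    /* Each byte needs two digits plus either a separator or the final terminator. */
    size_t needed = (size == 0) ? 1 : size * 3;
    if (out_size < needed) {
        return -1;
    }
    size_t pos = 0;
    out[0] = '\0';
    for (size_t i = 0; i < size; i++) {
        if (i > 0) {
            out[pos++] = ' ';
        }
        pos += (size_t)snprintf(out + pos, out_size - pos, "%02X", data[i]);
    }
    return 0;
}
```

Dumping the six bytes of a null-terminated "Hello" this way gives `48 65 6C 6C 6F 00`, which is the kind of pattern to search for next to the readable text column of a hex editor.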