
A quick intro to this section
A file, like anything else we work with as developers, is just binary data. And it's only binary data that we are sending over the wire.
Sending a file over a network involves reading the file as raw bytes, adding important details like its name and size, and combining them into a structured message. This message is then transmitted through a network connection. On the receiving side, the program reads the incoming bytes, uses the metadata to know what the data represents, and reconstructs the original file by writing the received bytes back to disk.
Sending a file over a network involves several key steps: First, the file is read from storage and converted into binary data, because networks transmit raw bytes. Next, important information about the file—called metadata, such as its name and size—is added to the data stream so the receiver knows how to interpret the incoming bytes. To ensure reliable delivery, the data is usually broken into smaller chunks or packets that can be sent sequentially across the network. On the receiving side, the program listens for incoming connections, reads the metadata to understand how to reconstruct the file, and then collects all the chunks. Finally, these chunks are reassembled in order and written to disk, recreating the original file for use. This process of structuring, chunking, transmitting, and reassembling ensures files are transferred accurately and efficiently over the network.
In this lecture I want to show you my thought process when it comes to sending a file over the wire. I've just quickly coded up some Python code. Please don't worry about Python but rather focus on the steps required in order to send a file over the internet to another device. Once the user selects the file, we convert it to binary. We then set up custom headers to tell the receiver how to read those bytes. We can then send the message (which consists of both the metadata and the file content itself).
A header is a small block of information placed at the start of a data stream so the receiver knows how to interpret what follows. It acts like a shipping label, with each slot describing part of the message: the first byte tells how long the filename is (so the receiver knows where it ends), the next set of bytes contains the filename itself, and the following four bytes indicate how large the file’s content is. This structure ensures both sender and receiver understand the same format, preventing confusion about where each piece of data begins and ends.
To finish off our simple example, let's take a look at how a receiver might handle an incoming file.
My (rather sloppy) code sets up a simple TCP server that waits for a client to send a file and then reconstructs it on the receiver’s end. It first creates and binds a listening socket to accept connections on a specified port. When a client connects, the server reads the first byte to know how long the filename is, then reads that many bytes to get the actual filename, and finally reads four more bytes to determine the size of the file content. It then reads the file data in chunks until it receives all the bytes, closes the network connection, and writes the complete data into a new file with the correct name on the local system.
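The header layout and the receiver's read order described above can be sketched as follows. This is a minimal sketch in JavaScript rather than the Python used in the lecture; the function names are hypothetical, and big-endian byte order for the 4-byte size field is an assumption, since the lecture does not state one.

```javascript
// Assumed layout:
// [1 byte: filename length][filename bytes, UTF-8][4 bytes: content size][content]
function buildMessage(filename, content) {
  const nameBytes = new TextEncoder().encode(filename);
  if (nameBytes.length > 255) {
    throw new RangeError("filename must fit in a single length byte");
  }
  const message = new Uint8Array(1 + nameBytes.length + 4 + content.length);
  const view = new DataView(message.buffer);
  view.setUint8(0, nameBytes.length);                   // slot 1: filename length
  message.set(nameBytes, 1);                            // slot 2: filename
  view.setUint32(1 + nameBytes.length, content.length); // slot 3: size (big-endian, assumed)
  message.set(content, 1 + nameBytes.length + 4);       // payload
  return message;
}

// Mirrors the receiver: read the length byte, then the name, then the size, then the content.
function parseMessage(message) {
  const view = new DataView(message.buffer, message.byteOffset, message.byteLength);
  const nameLength = view.getUint8(0);
  const filename = new TextDecoder().decode(message.subarray(1, 1 + nameLength));
  const size = view.getUint32(1 + nameLength);
  const start = 1 + nameLength + 4;
  return { filename, size, content: message.subarray(start, start + size) };
}

const sent = buildMessage("notes.txt", new TextEncoder().encode("hello"));
const received = parseMessage(sent);
console.log(received.filename, received.size);
```

A real receiver reads from a socket in pieces rather than from one complete array, but the order of the reads is the same.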
Traditionally we had to use servers to send a file over the network. But with the widespread use of WebRTC, we have a streamlined approach.
This section is optional, as we are only getting the project ready for files. This means we have to set up our modules, our server-side code and everything else necessary before we establish a WebRTC connection or send files.
A quick overview of our HTML and CSS. I will also include the starting files to this lecture.
In this lecture we will add our main.js file to the project.
Importantly, we will be using ESModules.
ESModules, enabled by type="module", are great for projects because they provide a standardized, modern way to organize JavaScript code into reusable, maintainable parts using import and export syntax. They support strict mode by default, help avoid polluting the global scope, and allow browsers and tools to perform static analysis for optimizations like tree shaking (removing unused code). This modular approach improves code clarity and structure, making it easier to manage dependencies, enhance scalability, and boost development productivity as projects grow.
Before we create our WebSocket server, I want to set up a state file so we can manage state. Client-side state refers to temporary data stored in the user's browser memory during a session, used to manage and track information like user IDs or WebSocket connections without persisting it to a database. In our example, the state object is going to hold session-specific data, enabling our app to efficiently manage user interactions and connections in real-time, with functions provided to update and access this data as needed during the app's runtime.
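A client-side state file along these lines might look like the sketch below. The field and function names are hypothetical, not the ones used in the course, and the `export` keywords you would use in an ES module are left out so the snippet runs on its own.

```javascript
// Session-only data: it lives in memory and is gone when the page reloads.
const state = { userId: null, socket: null };

function setUserId(id) {
  state.userId = id;
}

function setSocket(socket) {
  state.socket = socket;
}

// Hand out a shallow copy so callers cannot overwrite fields by accident.
function getState() {
  return { ...state };
}

setUserId("abc123");
console.log(getState().userId);
```

Keeping all reads and writes behind small functions makes it easier to see where the state changes.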
As you have seen in the previous lecture, we created client side state. This is temporary data that is stored in the browser and tracks the current user’s session and app status. However, it isn’t saved to a database or shared persistently. In our WebRTC app, the state object is going to hold essential info like user IDs and WebSocket connections locally, allowing the app to manage communication and file transfer dynamically.
WebSockets enable real-time, bidirectional communication between clients and servers. On the client side, the browser’s WebSocket API is essential for establishing and managing connections, while on the server side, a dedicated WebSocket server handles persistent communication through handshakes and message exchanges. Both components work together to facilitate seamless, continuous data flow for web applications.
In this particular lecture we will write code to establish a WebSocket connection from the CLIENT (browser). In a later lecture we will then set up our websocket server to listen for this incoming request and upgrade HTTP into using WebSockets.
WebSockets act as the signaling channel in WebRTC, allowing browsers and servers to quickly exchange connection setup details like session descriptions and network candidates. This initial coordination through WebSockets enables WebRTC peers to negotiate and establish their direct, real-time media and data connection, after which media flows peer-to-peer and WebSockets are no longer involved in transmission.
It's always nice to give feedback to developers in these kinds of projects. That's why I have coded up simple HTML + CSS + JavaScript for a simple custom logger. We effectively create a <div> element, add text content inside of it, style it, and then append it to the DOM (Document Object Model).
In this lecture I'll use express to set up a simple Node HTTP server. We'll use Node's native http module and will also install express. I'll also implement middleware (by using the "use()" method that express gives us) to tell Node that any requests to "static" files should be served from the /public directory.
A quick word on setting up a websocket server.
Now that we have an http server instance running in Node.js, the next step is to allow our server to handle incoming websocket upgrade requests. Remember, a websocket connection request has to be initiated by the client and handled by a server. Although we can build a native websocket server from scratch, in this course we will use a common library called ws. However, you can use Socket.io or any other library you wish.
Just like we did on the client side, I want us to set up state on the server to manage all connected users to our application at any given time. I will use traditional JavaScript arrays to do this, but I will also show you how to use the more modern Map object.
Registering WebSocket event listeners on the server side is essential because WebSocket communication relies on asynchronous, event-driven interactions. The server needs to listen and respond to various connection and data events to manage real-time communication properly.
When a user logs onto our site, we add them as an object into a global "connections" array. This leads to the next question: what happens if a user disconnects? Well in that case, we want to remove them from this array. That's what we'll do in this simple lecture.
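Adding and removing users can be sketched with both approaches mentioned above. This is an illustration under assumptions: the function names and the shape of each user object are hypothetical, and the sockets are stand-in objects.

```javascript
// Array version: every connected user is an object in a list.
let connections = [];

function addUser(id, socket) {
  connections.push({ id, socket });
}

function removeUser(id) {
  connections = connections.filter((user) => user.id !== id);
}

// Map version: session ID -> socket, with direct lookup and removal by key.
const connectionMap = new Map();

function addUserToMap(id, socket) {
  connectionMap.set(id, socket);
}

function removeUserFromMap(id) {
  connectionMap.delete(id);
}

addUser("alice-1", {});
addUser("bob-2", {});
removeUser("alice-1"); // what we would do when Alice's socket closes
console.log(connections.map((user) => user.id));
```

With a Map, finding or removing a user does not require scanning the whole list, which is the main reason to prefer it as the number of users grows.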
It's time to test our application to see if it's all working as expected thus far.
This is the main section of the course, where we will finally start working with files and transferring them over WebRTC. This lecture is a summary of the previous section and I will include the project files here.
In order to send a file to a device, you need the "who". In other words, you need a way to identify where to send the file to. This is the purpose of the receiver's session ID. It will be used to identify which peers need to establish a WebRTC connection. You'll see how it all fits together soon.
A quick explanation of why we have a session ID.
We don't want to allow a sender of a file to incorrectly type an address. We therefore want to check whether the receiver's ID they have typed into the <input> box has indeed been added to our connections array on the server and that the user is currently online. There are many ways we can do this. Indeed, we could send a message via WebSockets, or of course we could use AJAX. We will use the modern Fetch API to perform this check. Let's get crackin'
In the previous lecture we sent a Fetch AJAX request to the server. Before we finalize the code on the client side, let's hop over to our server and write server side code (using our express server) to listen for an incoming AJAX POST request to check whether the ID the sender has inserted into the <input> element is indeed valid.
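The server-side check could look roughly like the sketch below. The handler is shaped like an Express route handler and uses Express's `res.json()`; the `receiverId` field name, the sample IDs, and the assumption that a JSON body parser has already filled `req.body` are mine, not the course's.

```javascript
// Stand-in for the server's list of connected users.
const connections = [{ id: "alice-1" }, { id: "bob-2" }];

function isOnline(id) {
  return connections.some((user) => user.id === id);
}

// Would be registered on a POST route; replies with whether the typed ID is valid.
function checkReceiver(req, res) {
  res.json({ valid: isOnline(req.body.receiverId) });
}
```

The client can then read the `valid` flag from the response and decide whether to let the sender continue.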
Now that the server has done its checking and sent a response, we can hop back onto our client side and handle the data. Remember, fetch() returns a Promise. A Promise represents a value that will only be available later, and we handle it in then() callbacks (with catch() for errors). For more info, check out my AJAX courses.
Let's test our application.
In order for a user to select a file to "send" on their disk, they need to use the File API. There are a few ways this can be done in your code, but we'll use the <input type="file"> element.
I want to be clear that although it's most common to use the <input> element to invoke the File API, this is not the only way.
In this lecture I'll show you how to quickly create files on the fly, using the command prompt. Of course, you don't have to do this, but it makes for quick dev.
In this lecture, I demonstrate the process that occurs when a user uploads a file in a web browser using the <input type="file"> element. I explain how the browser allocates heap memory to store the uploaded file, including the associated metadata such as file size, name, and type.
Did you know that the input element in HTML for files does not allow us to view the location of the file on the user's system?
In this lecture let's use a simple "IF" statement in JavaScript to check the size property of our file and ensure that the user is not allowed to upload and send an empty file.
After this lecture, we'll be done with the basic UI stuff that is associated with a user uploading a file.
You've done it! You've literally coded everything required in order for the user to finally click "send". It's been a long section and a lot of content. So, take a break, enjoy the quiz and I'll see you in the next section of the course where we will establish a WebRTC connection and send the file.
A quick summary of what we'll be doing in this course.
This course is not about WebRTC. However, I want you to be aware of the high-level process of how WebRTC works, which is why I've put this lecture together. Remember, our goal in this course is to send files over WebRTC to another peer. The first step is to establish a WebRTC connection. This is done by peer 1 creating and sending what's known as a "WebRTC Offer". The receiver of the file gets this "offer", and then has to prepare their own "answer". These pieces of information are what each side uses to "find" the other, and then ... and only then ... can they establish a direct connection between each other.
Enough theory, let's get into it.
I have included a link to my WebRTC course, which is over 13.5 hours long and explains how WebRTC works in greater detail. I have also included the project files.
Creating a WebRTC offer is the first step in establishing a peer-to-peer connection. The process begins with calling the createOffer() method on the RTCPeerConnection object, which generates an SDP (Session Description Protocol) offer describing the media capabilities, codecs, data channels, and network information of the local peer. This SDP offer acts as a proposal for the connection, outlining what the sender can send and receive.
Once the offer is created, it must be set as the local description using setLocalDescription(), which also triggers the browser to start gathering ICE candidates for network connectivity. Afterward, the offer is sent to the remote peer through a signaling mechanism (such as WebSocket), which allows the remote peer to set it as their remote description and respond with an answer to complete the negotiation.
In this lecture, I'll show you how the functions were coded. I will show you how we use the RTCPeerConnection() object given to us by all browsers to create our WebRTC object. We then have to register and create our data channel and then set our local description.
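The sender's steps can be sketched as one function. It uses the standard `createDataChannel()`, `createOffer()` and `setLocalDescription()` methods; the function name, the `sendSignal` helper, the channel label and the message shape are all hypothetical.

```javascript
// pc is expected to be an RTCPeerConnection.
// sendSignal is a hypothetical helper that forwards a message over the signaling WebSocket.
async function createAndSendOffer(pc, sendSignal, receiverId) {
  // Create the data channel first so the offer's SDP describes it.
  const channel = pc.createDataChannel("file-transfer");
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer); // also starts ICE candidate gathering
  sendSignal({ type: "offer", to: receiverId, offer });
  return channel;
}
```

In the browser you would call it with `new RTCPeerConnection()` and a function that calls `send()` on your WebSocket with the message serialized as JSON.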
When the file sender creates an offer in WebRTC, it generates a session description containing details about media or data channel capabilities, network candidates, and connection parameters. This offer is sent to the signaling server in our course via a WebSocket connection, as part of the negotiation process to establish a peer-to-peer connection. The signaling server doesn’t modify the offer but simply acts as a communication bridge to help both peers exchange connection data.
When the signaling server receives the offer from the sender, it relays that information to the intended receiver. The server uses an established signaling channel (identified by the receiver's session ID) to forward the offer message. This allows the receiver’s browser or application to process the sender’s session description, set it as a remote description, and then generate an answer that will similarly be sent back through the signaling server to complete the connection setup.
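The relay step on the server can be sketched as a small function. It assumes the connected users are kept in a Map of session ID to socket and that each message carries a `to` field; both are assumptions for illustration, as is the `from` field added for the receiver's benefit.

```javascript
// connections: Map of session ID -> socket-like object with a send() method.
function relay(connections, senderId, rawMessage) {
  const message = JSON.parse(rawMessage);
  const target = connections.get(message.to);
  if (!target) {
    return false; // receiver is offline or the ID is unknown
  }
  // Forward the payload untouched; only note who it came from.
  target.send(JSON.stringify({ ...message, from: senderId }));
  return true;
}
```

The same function works for answers and ICE candidates, since the server never needs to look inside the payload.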
The next step in the process is for the receiver to process the offer. To do this, I want to set up a websocket handler that listens for incoming requests.
Receiving the WebRTC offer involves the receiving peer accepting the incoming Session Description Protocol (SDP) offer that contains proposed media and connection parameters from the initiating peer. The receiver sets this offer as its remote description using setRemoteDescription(), which informs WebRTC of the proposed connection settings. Afterwards, the receiver generates an answer with createAnswer(), which produces an SDP answer reflecting its accepted configurations and possible modifications within the constraints of the offer, then sets this answer as its local description with setLocalDescription(). Finally, the generated answer is sent back to the offerer through the signaling channel, completing the negotiation required to establish the WebRTC peer-to-peer connection.
Sending and receiving the WebRTC answer involves the answering peer sending its locally generated SDP answer back to the initiating peer via the signaling channel after setting it as its local description with setLocalDescription(). The initiating peer, upon receiving this answer, sets it as its remote description using setRemoteDescription(), which finalizes the negotiation by synchronizing both peers' connection parameters.
Dealing with ICE candidates in WebRTC involves generating ICE candidates for the local peer, which need to be sent to the remote peer via the signaling channel. These candidates represent potential network paths for communication. Meanwhile, the peer also receives ICE candidates from the remote peer, which must be added to the peer connection using addIceCandidate(). Because ICE candidates can be received before the remote description is set on a peer, it’s important to implement a buffer that temporarily holds incoming candidates until the remote description is ready, ensuring smooth handling and avoiding errors.
I know this sounds like a lot, but I'm your wingman and I'm with you EVERY STEP OF THE WAY. Let's get crackin'
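The buffering idea can be sketched like this. Only `addIceCandidate()` is a real RTCPeerConnection method here; `createCandidateBuffer`, `add` and `flush` are hypothetical names for illustration.

```javascript
// pc is expected to be an RTCPeerConnection.
function createCandidateBuffer(pc) {
  const pending = [];
  let remoteReady = false;

  function apply(candidate) {
    // addIceCandidate() returns a Promise; log failures instead of leaving them unhandled.
    pc.addIceCandidate(candidate).catch((err) => console.error(err));
  }

  return {
    // Call for every candidate that arrives over the signaling channel.
    add(candidate) {
      if (remoteReady) apply(candidate);
      else pending.push(candidate); // too early: hold it for later
    },
    // Call once setRemoteDescription() has resolved.
    flush() {
      remoteReady = true;
      while (pending.length > 0) apply(pending.shift());
    },
  };
}
```

Candidates that arrive early are applied in arrival order as soon as `flush()` runs, and later ones go straight through.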
In this lecture we are just finishing off all of our logic related to the ICE candidates.
We are finally done with setting up our entire project, including establishing a WebRTC connection. Before you try your hand at the quiz, let's test the code to make sure everything works.
In modern web applications, files can be read and processed in several ways depending on performance needs and memory constraints. The FileReader API provides simple methods to fully read files into memory as text or binary data. The Blob interface allows working with file slices or chunks without loading the entire file, enabling more efficient handling of large data. The Streams API offers the most granular control, reading files piece by piece as a continuous data stream, making it ideal for progressive uploads or real-time file transfer over WebRTC data channels.
As developers, we have options.
One decision we have to make is how we would like to send our file over the data channel. By "how" I mean what format.
There are practically only 2 options: we can represent our file's binary data inside of an ArrayBuffer (chunking), or we could represent the binary as a "stream" using the Streams API. The choice depends on your browser compatibility requirements and your desired level of memory efficiency.
Once the dataChannel triggers the "open" event, we know that the WebRTC connection is successful and that the sender can begin the file transfer process. Let's set up our code.
Every <progress> element in HTML has a "max" attribute. I want to set this to the file size. For now, let's hardcode the "value" attribute to 50% of max. Later we will make it dynamic of course.
I then want to test our code.
In this lecture let me show you that the FileReader API is available globally in the browser, and that in order to use it, we have to use the famous JavaScript "new" keyword.
When using FileReader to send data over a WebRTC DataChannel without chunking, large payloads can overwhelm the DataChannel's send queue, causing send queue errors and buffer overflows. Browsers limit both the size of a single message (around 16 KB is a commonly cited safe size across browsers) and the amount of data that can sit in the send buffer, so sending a large file all at once will fill the send buffer faster than it can be drained, leading to blocked or failed sends. This can cause the DataChannel to stall or produce errors because the send buffer cannot handle excessively large messages at once.
There is no explicit file size limit set by the FileReader API itself. However, reading very large files (such as 5GB) at once with methods like fileReader.readAsArrayBuffer(file) often leads to errors or crashes because the entire file is loaded into memory, and available memory (browser and system) is the actual limiting factor. Browsers and runtime environments have different memory limits and handling capabilities, so large files can cause out-of-memory errors or other exceptions when the file size exceeds what the browser can handle in one go.
For huge files, you should not attempt to read the entire file at once. Instead, use chunked reading by slicing the file into smaller Blob parts and reading those chunks sequentially.
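Slicing a file into chunks can be sketched with `Blob.slice()`. The 16 KiB chunk size and the function name are assumptions for illustration. Each slice would then be read on its own, for example with FileReader, before being sent.

```javascript
const CHUNK_SIZE = 16 * 1024; // assumed chunk size

function sliceIntoChunks(blob, chunkSize = CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    // slice() clamps the end index, so the last chunk is simply shorter.
    chunks.push(blob.slice(offset, offset + chunkSize));
  }
  return chunks;
}

// A File from <input type="file"> is a Blob, so it can be passed in the same way.
const file = new Blob([new Uint8Array(40000)]);
const chunks = sliceIntoChunks(file);
console.log(chunks.map((chunk) => chunk.size)); // [16384, 16384, 7232]
```

Slicing does not read the data into memory; it only describes which byte range each chunk covers.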
When you use FileReader.readAsArrayBuffer() to load a file into the browser's memory, you will notice in the operating system's task manager that the memory heap size increases as the file data is being loaded into memory. This happens because the entire file content is read and stored in the browser’s RAM as an ArrayBuffer. If the user refreshes or closes the browser page during this process, the active memory allocated for the file is released, and the browser’s garbage collector reclaims that memory, causing the heap size to shrink back down.
A quick Q&A on what happens if you attempt to receive a file that is larger than your RAM.
A quick description, in my own words, of what an ArrayBuffer is.
In WebRTC DataChannels, chunking involves breaking large messages into smaller parts to facilitate transmission, while the send queue manages these chunks by temporarily storing them until they are sent over the network.
Sending the maximum size chunk possible can lead to inefficiencies such as increased latency, higher memory usage, and potential network congestion. Smaller chunks allow for more controlled, reliable transmission, quicker error detection and recovery, and better overall resource management, making data transfer more flexible and resilient to network variability.
Blobs are easier to work with, but give you less granular control.
Let's finish off our chunking logic, and while we're at it, complete our send <progress> HTML element by dynamically setting the value property to the number of bytes we've successfully read.
Let's take a break here and test our logic thus far. We should see each chunk of our file being read, and the chunk sizes. Very exciting.
The bufferedAmount property of a WebRTC DataChannel shows how many bytes of data are currently waiting in the send queue to be transmitted over the network. Think of it as measuring how full that “outgoing mailbox” is at any moment. When you use the send() method to push data into the channel, it gets buffered if the network or peer can’t handle it immediately, so bufferedAmount rises. Monitoring this value is crucial for flow control; if it gets too large, you risk filling the buffer and causing errors or stalling. To manage this, you can pause sending more data until bufferedAmount drops to a safer level, often indicated by the bufferedAmountLowThreshold property triggering a bufferedamountlow event. This way, you ensure smooth, reliable data streaming without overwhelming the channel.
In WebRTC, the bufferedAmount property indicates the current number of bytes queued but not yet sent on the data channel, and bufferedAmountLowThreshold is a threshold you set to receive an event (bufferedamountlow) when the queue drains below it, letting you know it's safe to send more data without overwhelming the buffer.
A quick word before we move on.
The bufferedAmount and bufferedAmountLowThreshold properties work hand-in-hand to help manage flow control on a WebRTC DataChannel. bufferedAmount tells you how many bytes are currently queued but not yet sent, while bufferedAmountLowThreshold sets a threshold for what you consider "low." When the amount of buffered data drops to or below this threshold, the DataChannel fires a bufferedamountlow event, signaling it's safe to send more data. This mechanism helps prevent buffer overflow by letting you pause sending when the queue is too full and resume efficiently when there's enough space again, ensuring smooth, reliable data transfer without overwhelming the channel
In the context of sending a file over a WebRTC DataChannel, pausing sending when the send queue buffer gets too full means monitoring the RTCDataChannel.bufferedAmount property, which indicates how many bytes are currently queued but not yet sent. When this buffered amount exceeds a set threshold (bufferedAmountLowThreshold), the sender pauses sending more data to avoid overwhelming the buffer and network.
Resuming sending when the WebRTC DataChannel send queue buffer drains to a lower threshold involves listening for the bufferedamountlow event, which is triggered once the buffered amount of queued data drops to or below the bufferedAmountLowThreshold value. This event signals that the network and receiver are ready to handle more data without risk of overflow or congestion. Implementing this, the sender pauses sending when the buffer is full and only resumes pushing more chunks of the file or data once this event occurs, ensuring controlled flow and preventing stalls or crashes during large file transfers by pacing the sends according to actual network capacity and buffer availability.
Before we start adding logic on the receiver's side, let's test our code thus far.
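The pause-and-resume loop can be sketched as follows. It relies on the standard `bufferedAmount`, `bufferedAmountLowThreshold`, `onbufferedamountlow` and `send()` members of RTCDataChannel; the function name, the 64 KiB default threshold, and the choice to take already-read ArrayBuffer chunks as input are assumptions, not the course's implementation.

```javascript
// channel is expected to be an RTCDataChannel; chunks is an array of ArrayBuffers.
function sendWithBackpressure(channel, chunks, threshold = 64 * 1024) {
  let index = 0; // lives in the closure, so the event handler needs no argument
  channel.bufferedAmountLowThreshold = threshold;

  function pump() {
    while (index < chunks.length) {
      if (channel.bufferedAmount > threshold) {
        return; // pause: the queue is too full, wait for bufferedamountlow
      }
      channel.send(chunks[index]);
      index += 1;
    }
  }

  channel.onbufferedamountlow = pump; // resume once the queue has drained
  pump();
}
```

Because the position is tracked inside the function, the event handler can resume from the right chunk without being passed an offset.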
This lecture will talk about the logic of peer2 receiving a file over a WebRTC data channel.
We will access the file object from the FileList object returned to us when the user selects the file via the <input> element. We will then create a metadata object and send this, via WebRTC, to the receiver.
This metadata message includes essential details such as the file name, size, type (MIME type), and optionally last modification date or a hash for integrity checks. The metadata is sent as a JSON string or an agreed-upon format, allowing the receiver to prepare for the incoming file by allocating appropriate resources and informing the user about the file details. This initial metadata transmission helps the receiver know the exact file size to monitor received data and determine when the file transfer is complete, enabling proper reassembly and handling of the file data once received.
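Building the metadata message might look like the sketch below. The function name is hypothetical; `name`, `size` and `type` are real properties of a File object, and JSON is one of the formats the description above mentions.

```javascript
// file is expected to be a File from the <input> element's FileList.
function buildMetadata(file) {
  return JSON.stringify({
    name: file.name,
    size: file.size, // lets the receiver know when the transfer is complete
    type: file.type, // MIME type, used later when rebuilding the file
  });
}

// A plain object stands in for a File here so the snippet runs anywhere.
const metadataMessage = buildMetadata({ name: "photo.png", size: 2048, type: "image/png" });
console.log(metadataMessage);
```

The resulting string would be the first thing passed to the data channel's `send()` method, ahead of any file chunks.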
When a WebRTC data channel is established between peers, the receiving side listens for incoming data by registering an event listener for the "message" event on its RTCDataChannel object. This listener receives a MessageEvent, whose data property contains the data sent by the sender. This mechanism enables real-time, bidirectional data communication independently of the media streams managed by the RTCPeerConnection. The event fires whenever a message is received, allowing the application to process the message as needed (e.g., updating UI or processing data). The listener can be set using addEventListener("message", callback) or the onmessage event handler property.
Extracting file metadata means retrieving and understanding data that describes the essential attributes of a digital file, such as its size, type, creation date, or author, before processing or using the file itself. Metadata provides crucial contextual information that helps systems manage, organize, or display the actual file data properly. In file transfer scenarios, this involves receiving a structured message (usually as JSON) that outlines these details, enabling the receiver to prepare for incoming file chunks, track progress, and validate the transfer. Essentially, file metadata acts as descriptive data about the file's characteristics, guiding how the file should be handled or interpreted.
In this lecture we will create the receiveFile function that will extract the first message received (which is in JSON form) and insert that into a metadata object which we will use to extract crucial pieces of information.
It's finally time to push each ArrayBuffer into our global array to store all the chunks received. While we are at it, let's also access our HTML <progress> element and dynamically update its value property to be equal to the total number of bytes received.
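The receiving side can be sketched as a message handler that treats the first message as JSON metadata and every later message as an ArrayBuffer chunk. The names are hypothetical, and the sketch assumes the channel's `binaryType` is set to "arraybuffer" so that binary messages arrive as ArrayBuffers.

```javascript
function createReceiver(onComplete) {
  let metadata = null;
  let receivedBytes = 0;
  const chunks = [];

  // Meant to be used as the data channel's "message" event handler.
  return function handleMessage(event) {
    if (metadata === null) {
      metadata = JSON.parse(event.data); // first message: metadata as a JSON string
      return;
    }
    chunks.push(event.data); // later messages: ArrayBuffer chunks
    receivedBytes += event.data.byteLength;
    // This is also where a <progress> element's value would be set to receivedBytes.
    if (receivedBytes === metadata.size) {
      onComplete(metadata, chunks);
    }
  };
}
```

Comparing the running byte count against the size from the metadata is what tells the receiver the transfer is complete.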
In WebRTC file transfer, after receiving the complete file data as chunks, a Blob object is created from those chunks to represent the file in a binary, immutable format. The URL interface's createObjectURL() method is then used to generate a temporary URL pointing to this Blob, which can be assigned to an anchor element as a downloadable link. When users click this link, the browser triggers a download of the reconstructed file locally. This approach efficiently enables peer-to-peer file transfers by converting raw binary data into a user-friendly downloadable file without saving it to a server first.
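Rebuilding the file from the received chunks can be sketched as below. The function name and sample data are made up for illustration. The download link part is shown as comments because it needs a browser DOM, and `metadata.name` there is a hypothetical variable.

```javascript
function assembleFile(chunks, mimeType) {
  // The Blob constructor accepts an array of ArrayBuffers and joins them in order.
  return new Blob(chunks, { type: mimeType });
}

const receivedChunks = [new Uint8Array([1, 2, 3]).buffer, new Uint8Array([4, 5]).buffer];
const blob = assembleFile(receivedChunks, "application/octet-stream");
console.log(blob.size, blob.type);

// In the browser, the download link could then be wired up roughly like this:
// const link = document.createElement("a");
// link.href = URL.createObjectURL(blob);
// link.download = metadata.name;
```

Once the download is no longer needed, `URL.revokeObjectURL()` can release the temporary URL.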
Before we get too excited (which I already am), let's test our code to see if everything is working as expected.
I will also take this time to talk about a question you may or may not be thinking ... "what if our metadata message on the data channel is greater than one fragment? In other words, should we not be chunking our metadata message and reassembling those chunks on the receiver's side?"
Great question. Let me answer it.
A Blob object is created from an array of ArrayBuffers because the URL.createObjectURL() method requires a Blob input to generate a temporary downloadable URL. While ArrayBuffers represent raw binary data in memory and are mutable, Blob represents an immutable file-like object suitable for file handling in browsers. Creating a Blob from received ArrayBuffers enables packaging the binary chunks into a single file-like entity that the browser can manage and reference as a downloadable resource via the URL interface, facilitating an efficient way to let users download reconstructed files from binary data transferred over WebRTC.
A quick reminder before we move on.
This trips up 99% of developers.
Closing a WebRTC data channel gracefully involves calling its close() method, which triggers an asynchronous process where the channel enters the "closing" state, ceasing the acceptance of new data while still allowing any buffered or in-flight data to be transmitted fully. This ensures all pending messages are delivered before the underlying transport and protocol layers close the channel. Once this process completes, the channel's state changes to "closed," and a close event is fired to notify that the channel is fully closed. Properly closing the data channel before closing the associated peer connection helps prevent data loss and resource leaks, allowing an orderly shutdown with adequate cleanup and user interface updates indicating the connection's termination.
As you now understand, closing a WebRTC connection starts by closing the data channel first using its close() method and listening for its close event to confirm that all buffered and in-flight data has been sent and the channel is fully closed. Only after the data channel is closed should you call the close() method on the RTCPeerConnection object, which terminates all underlying media and data transports, releases network resources, and transitions the connection to a closed state. This order ensures that the data channel closure signaling reaches the remote peer properly, preventing data loss or inconsistent states. Proper closure management also helps free system resources and avoids memory leaks, allowing the application to cleanly tear down the WebRTC session without errors or lingering connections.
(advanced: note that the objects we created in JavaScript are still going to be populated with the offer and answer and data channel information, so we may need to deal with that too).
When working with WebRTC DataChannels and PeerConnection (PC) objects for sending files, it is necessary to properly close and clean up these objects to free resources and avoid memory leaks. You should first call the close() method on the DataChannel objects to gracefully shut down the data transfer and notify the remote peer. After closing all DataChannels, close the associated PeerConnection objects. Only after calling close() on these objects should you set their references to null in your code. Setting them to null after closing helps prevent accidental reuse of closed connections and allows JavaScript's garbage collector to reclaim the memory they occupied, which is especially important in long-running applications or single-page apps. This order—close first, then nullify—is crucial for proper resource management and memory clearance in WebRTC applications.
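The close-first, then-nullify order can be sketched as below. The `session` object and function name are hypothetical. This simplified version closes both objects back to back; a fuller version would wait for the channel's close event before closing the peer connection.

```javascript
// session is a hypothetical object holding our references: { channel, pc }.
function teardown(session) {
  if (session.channel) {
    session.channel.close(); // 1. close the data channel
    session.channel = null;  //    then drop our reference to it
  }
  if (session.pc) {
    session.pc.close();      // 2. close the peer connection
    session.pc = null;       //    then drop our reference to it
  }
}
```

Calling `teardown()` a second time is harmless, because the references are already null.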
I will add simple logic to allow a user to stop sending a file if they so choose. We could spend a few hours improving the UI experience but this course is focused on file transfers so I will merely display an "alert" (yuck, I know) to the user telling them to refresh their page.
I was about to upload the ending files and something didn't sit right with me. I noticed that when we call our readChunk() function, we require an "offset" argument. However, when we listen for the bufferedamountlow event, we don't pass an argument into the function, which results in offset being undefined. Although it doesn't break our platform, it's not perfect.
So without further ado, let's fix it.
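One way to avoid the undefined offset is to keep the read position in shared state instead of passing it as an argument, so the bufferedamountlow handler always resumes from the right place. The sketch below is only an illustration of that idea: `createChunkSender`, `CHUNK_SIZE` and `HIGH_WATER_MARK` are names and values I've made up, and it sends from bytes already in memory rather than calling the course's readChunk() function.

```javascript
// Sketch: the offset lives in a closure, so no caller has to pass it in.
const CHUNK_SIZE = 16 * 1024;         // assumed chunk size (16 KB)
const HIGH_WATER_MARK = 1024 * 1024;  // assumed pause threshold (1 MB)

function createChunkSender(channel, bytes) {
  let offset = 0; // shared by the first call and by the event handler

  function sendNextChunks() {
    while (offset < bytes.byteLength) {
      // Stop when the send queue is full; the event below resumes us.
      if (channel.bufferedAmount > HIGH_WATER_MARK) return;
      const chunk = bytes.subarray(offset, offset + CHUNK_SIZE);
      channel.send(chunk);
      offset += chunk.byteLength;
    }
  }

  // Fire bufferedamountlow once the queue drains below one chunk.
  channel.bufferedAmountLowThreshold = CHUNK_SIZE;
  channel.onbufferedamountlow = () => sendNextChunks();

  return { start: sendNextChunks, getOffset: () => offset };
}
```

Usage would look something like `createChunkSender(dataChannel, new Uint8Array(arrayBuffer)).start()`. Because the handler takes no arguments, it no longer matters that the event doesn't supply an offset.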
Ending files
Blobs and ArrayBuffers both handle binary data in web development, but they serve different purposes. A Blob is a file-like, immutable container for raw binary data that carries metadata such as size and MIME type, which makes it ideal for representing files or data chunks for storage, uploading, or display. An ArrayBuffer, in contrast, is a fixed-length, mutable memory buffer that stores raw binary data with no format of its own; you read and write it through typed arrays or DataView objects. Working with Blobs makes it easy to hand file data to web APIs, while ArrayBuffers enable low-level binary operations and data processing.
The File API gives us access to the slice() method on Blob and File objects, which allows us to create a new Blob representing just a part (a "slice") of the original file’s data without copying or altering the original file. They call it "slice" because it works like slicing a segment out of a larger piece—similar to cutting a slice from a big cake—so you can work with or send just that smaller piece. This naming helps convey the idea of taking a portion of the data for processing, making file handling efficient and intuitive when transferring large files in chunks.
Let's get coding. Because a File is a specialized kind of Blob, we can slice it and send the resulting chunks directly, without using the FileReader API for intermediate reading or conversion. The DataChannel's send() method accepts Blob objects, so the immutable slices created with file.slice() can be transmitted as-is to the remote peer. Skipping FileReader simplifies the code and avoids its overhead, which matters most for large files.
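Here is a rough sketch of the slicing step on its own. `sliceIntoChunks` is a hypothetical helper name, and the 16 KB chunk size in the usage note is just an assumption for illustration.

```javascript
// Sketch: cut a Blob (or File) into fixed-size Blob slices.
// slice() does not copy or modify the original; each slice is a new Blob.
function sliceIntoChunks(blob, chunkSize) {
  const chunks = [];
  for (let start = 0; start < blob.size; start += chunkSize) {
    // An end past blob.size is clamped, so the last slice is just shorter.
    chunks.push(blob.slice(start, start + chunkSize));
  }
  return chunks;
}
```

You could then loop over `sliceIntoChunks(file, 16 * 1024)` and pass each slice to `dataChannel.send()`, ideally while also watching bufferedAmount. Support for Blobs in send() has historically varied between browsers, so it is worth testing in the browsers you target.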
Latest files using Blob approach.
Compression can significantly improve transfer efficiency by reducing the amount of data sent.
Pretty obvious, right?
The Pako library is a popular JavaScript port of the zlib compression library, providing fast and efficient compression and decompression in the browser. Using Pako, developers can compress file slices or binary chunks before sending them over a DataChannel, which helps lower bandwidth usage and speeds up transmission, especially on limited or slow networks. On the receiving side, decompression with Pako restores the original data before processing or saving. Integrating Pako for compression with WebRTC DataChannels requires chunking files into manageable sizes (commonly around 16 KB), compressing each chunk, sending it, and decompressing on receipt. This method optimizes peer-to-peer file sharing by balancing performance, reliability, and network constraints. However, note that compression adds CPU overhead and may not always be beneficial for already compressed file types like video or images.
Data compression is a process that reduces the size of data by encoding it more efficiently. It looks for patterns or repeated sequences within the original data and replaces them with shorter codes or references. This reduces the total number of bits needed to represent the data. When the data is needed again, a decompression process reverses these changes to restore the original data exactly (in lossless compression). Compression makes it faster to send or store data because there is less information to handle, but it requires some processing power to compress and decompress.
In the context of WebRTC DataChannels, sending files often becomes more efficient when the files are compressed before transmission. Compression reduces the size of the data payloads, which helps to lower bandwidth consumption and speed up transfer times across the network. The Pako library, a JavaScript port of zlib, is commonly used to compress data in web applications. In this lecture, we will focus on sending compressed ArrayBuffers; after slicing the file into manageable chunks, each chunk is compressed using Pako before being sent over the DataChannel. This approach optimizes data transfer by minimizing the amount of data sent while maintaining integrity.
In this lecture we will implement robust code to ensure that whether we receive a Blob or an ArrayBuffer, we always convert the data into a Uint8Array so that we can inflate it back to its original size.
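The normalization step might look something like this. It's a sketch under my own assumptions: `toUint8Array` is a hypothetical helper name, and the exact checks used in the lecture may differ.

```javascript
// Sketch: turn whatever the data channel delivered into a Uint8Array.
async function toUint8Array(data) {
  if (data instanceof Uint8Array) return data;
  if (data instanceof ArrayBuffer) return new Uint8Array(data);
  if (ArrayBuffer.isView(data)) {
    // Other typed arrays / DataView: view the same bytes as Uint8Array.
    return new Uint8Array(data.buffer, data.byteOffset, data.byteLength);
  }
  if (typeof Blob !== "undefined" && data instanceof Blob) {
    // Blobs must be read asynchronously before we can touch their bytes.
    return new Uint8Array(await data.arrayBuffer());
  }
  throw new TypeError("Unsupported message payload");
}
```

With Pako loaded on the page, the receiver could then do something like `pako.inflate(await toUint8Array(event.data))` inside its message handler.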
A quick heads up that trying to compress images can cause problems.
When using Pako to compress files over WebRTC, compressing images can sometimes result in a larger file than the original because image formats like PNG and JPEG are already compressed using specific algorithms optimized for their data structure. Applying a general-purpose compression library like Pako (which implements DEFLATE compression) to such images may fail to find additional redundancy to reduce size and can even add overhead, making the compressed file bigger.
The easy solution is to reduce your chunk size before compressing and sending. By using smaller chunks, even if compression expands some chunks, they stay within the maxMessageSize limit, preventing errors.
This is a two-part solution. First, let's implement some logic on the sender's side that only compresses a file if the file type is not an image. Every file carries MIME type information, and every JavaScript string has a handy method called "startsWith" that lets us perform a simple IF check.
This is the final solution to our image problem: performing a similar IF check on the receiver's side so that we only decompress files if the file type is not an image.
There was a logical error in my code, which I missed earlier. Now it's time to fix it ONCE AND FOR ALL, so we can finally see our compression code work 100%.
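Both halves of the check can be sketched like this. The function names are hypothetical, and I'm passing the compression functions in as arguments (rather than calling Pako directly) so the sketch stays self-contained; that is a design choice for the example, not necessarily how the lecture code is organized.

```javascript
// Sketch: skip compression for images, on both sender and receiver.
function isImage(mimeType) {
  // MIME types for images look like "image/png", "image/jpeg", ...
  return typeof mimeType === "string" && mimeType.startsWith("image/");
}

// Sender side: compress only when the file is not an image.
function maybeCompress(chunk, mimeType, deflate) {
  return isImage(mimeType) ? chunk : deflate(chunk);
}

// Receiver side: mirror the same decision before inflating.
function maybeDecompress(chunk, mimeType, inflate) {
  return isImage(mimeType) ? chunk : inflate(chunk);
}
```

With Pako available, the calls would be along the lines of `maybeCompress(chunk, file.type, pako.deflate)` and `maybeDecompress(chunk, metadata.type, pako.inflate)`, where `metadata.type` stands for the MIME type the sender transmitted ahead of the file. Both sides must make the same decision, otherwise the receiver will try to inflate data that was never deflated.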
You've come a long way. When sending files over a WebRTC data channel with compression, the process involves working directly with the file's binary data, typically represented as an ArrayBuffer. This raw data is then passed through a compression step using the Pako library, which implements the DEFLATE algorithm to reduce data size for faster transmission. However, certain file types—especially images like JPEG and PNG—are already compressed by design, so applying additional compression can be inefficient or even increase file size. To handle this, we built logic to detect such cases and skip unnecessary compression. Well done on reaching this stage; attached are the latest coding files. In the next section, we’ll explore how to perform this efficiently using the Streams API.
An intro to this final section.
The Streams API lets you create a reader from a readable stream to handle data chunk by chunk. When reading a file, you get a reader from the file’s stream and call its read() method repeatedly to asynchronously receive each chunk as it becomes available. Unlike the older FileReader API, where you had to manually manage loading and slicing, the Streams API automatically controls chunk sizes and flow, simplifying the process of reading files piecewise. This makes it ideal for sending chunks one by one over WebRTC DataChannels without manually specifying how to break the file apart.
Once you invoke the stream() method on your file, why do you have to call the getReader() method? Isn't this overkill? I'm so glad you are thinking about these questions.
You know the theory: the modern approach is the Streams API. To use it, you need to access your file object and call a method named stream(). This creates a ReadableStream, but it doesn't tell the application that you intend to read the contents of the stream. To do that, you need to call yet another method, getReader().
I'll show you this, and more, in this lecture.
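As a rough preview, the stream() and getReader() steps fit together like this. `readChunks` and `onChunk` are names I've invented for the sketch; the lecture code may be structured differently.

```javascript
// Sketch: read a File/Blob chunk by chunk with the Streams API.
async function readChunks(blob, onChunk) {
  // stream() creates the ReadableStream; getReader() locks it for reading.
  const reader = blob.stream().getReader();
  let totalBytes = 0;

  while (true) {
    // Each read() resolves with the next chunk, or done: true at the end.
    const { done, value } = await reader.read();
    if (done) break;
    totalBytes += value.byteLength;
    await onChunk(value); // value is a Uint8Array chosen by the browser
  }
  return totalBytes;
}
```

A call such as `readChunks(file, (chunk) => dataChannel.send(chunk))` would hand every chunk to the data channel. Notice that nothing here decides how big each chunk is.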
A quick "did you know" tip.
One challenge with the Streams API is that you generally have no direct control over how chunking is done or the size of each chunk produced by the readable stream. This can be problematic when the chunk size is critical, such as when sending data over a WebRTC DataChannel, which has a specified maxMessageSize limit for the largest message it can send. Since the Streams API chunk size is implementation-defined and often not predictable—commonly around 64 KB but can vary—you risk getting chunks larger than the DataChannel's allowed maximum size. This raises the important question of how to handle situations where the Streams API produces chunks exceeding WebRTC's limits, potentially requiring additional slicing or buffering logic to split chunks manually before sending them over the DataChannel to avoid errors or message rejections.
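The "additional slicing" mentioned above could be as simple as the sketch below. `splitChunk` is a hypothetical helper, and `maxSize` is whatever limit you decide to respect (for example a value read from the connection's SCTP transport, or a conservative constant).

```javascript
// Sketch: break an oversized Uint8Array into pieces no larger than maxSize.
function splitChunk(chunk, maxSize) {
  const pieces = [];
  for (let start = 0; start < chunk.byteLength; start += maxSize) {
    // subarray() creates views on the same memory, so nothing is copied.
    pieces.push(chunk.subarray(start, start + maxSize));
  }
  return pieces;
}
```

Each chunk coming out of the stream reader would pass through `splitChunk(value, maxSize)` before being sent. It works, but it is extra bookkeeping.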
It is surprising that so many developers always revert to using the Streams API, but few understand why. Is there any real difference between this approach versus using Blobs or the FileReader API?
The Streams API includes a BYOB (Bring Your Own Buffer) mode that lets developers control the size of the chunks they read by supplying their own buffer. This addresses the chunk size problem: you allocate a buffer that matches the maximum message size allowed by the WebRTC DataChannel. With a ReadableStreamBYOBReader, you call read() with a developer-supplied ArrayBufferView of the desired size, and each read returns no more bytes than that view can hold, so every chunk fits within the WebRTC message size limits. This removes the risk of receiving oversized chunks from the default stream reader and simplifies chunked transmission over DataChannels.
It's a bittersweet moment, because we've reached one of the last lectures of this course, where we will finally use the BYOB mode of the Streams API to specify the exact size of chunking we desire. This will solve the issue we were facing earlier and our application will work smoothly. WELL DONE.
Ending files, and a huge congrats.
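A minimal sketch of a BYOB read loop follows. `readInFixedChunks` and `onChunk` are hypothetical names, and the loop assumes the stream it is given is a readable byte stream (getReader({ mode: "byob" }) throws otherwise).

```javascript
// Sketch: read a byte stream with our own buffer, so no chunk exceeds chunkSize.
async function readInFixedChunks(stream, chunkSize, onChunk) {
  const reader = stream.getReader({ mode: "byob" });
  let buffer = new ArrayBuffer(chunkSize);

  while (true) {
    // The reader fills our view with at most chunkSize bytes.
    const { done, value } = await reader.read(new Uint8Array(buffer));
    if (done) break;
    // Hand over a copy, because the underlying buffer is reused below.
    await onChunk(value.slice());
    // read() transferred our buffer; take it back from the returned view.
    buffer = value.buffer;
  }
}
```

In the browser this might be called as `readInFixedChunks(file.stream(), 16 * 1024, (chunk) => dataChannel.send(chunk))`, assuming the browser exposes the file's stream as a byte stream. A read can return fewer bytes than the buffer holds (the last chunk almost always does), but never more.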
Outro video
*** THE ONLY WEBRTC COURSE THAT FOCUSES ON WEBRTC DATA CHANNELS TO TRANSFER FILES ***
Build a file transfer app using WebRTC DataChannels
Understand how data channels can be used for sending files
Master the FileReader API, Streams API and Blobs
Work with Blobs to assemble received file chunks into a downloadable link
Learn the importance of sending file metadata (filename, size, type)
Explore chunking strategies to split files into manageable pieces
Gain insight into browser memory handling
Understand how to avoid memory bloat by streaming data rather than loading entire files at once
Equip yourself with best practices for WebRTC signaling and connection management to build robust and scalable file sharing apps.
And more!!!
Feel like diving in and touching the bottom? Now’s your chance. Code an app that sends large files to another device. Bypass servers by using WebRTC.
TWO (OF MANY) REASONS WHY THIS COURSE IS A GAME CHANGER:
#1. You can look forward to an advanced course which is structured to motivate you. Together we build a complete file-sharing project from scratch without relying on servers or intermediaries. This course provides practical skills like no other.
#2. You will use WebRTC for this, which is a modern, private and secure technology. This course covers WebRTC sufficiently for its purposes, but those wanting a more comprehensive overview of WebRTC may consider my specialized WebRTC course.
Other mainstream sites (like SendBig, TransferNow, WeTransfer, etc) use servers and traditional methods to send files. This course teaches you how to build peer-to-peer file transfers using cutting-edge WebRTC technology, making your file sharing faster, more secure, and incredibly efficient. No servers, no middlemen, just direct device-to-device transfer in real time. How amazing is that?
Sending files directly between devices is a critical, high-demand skill in modern web applications, and this course teaches you how to do it using cutting-edge WebRTC. You will master building WebRTC peer-to-peer file transfers that bypass servers completely, ensuring file sharing is fast, secure, and efficient.
Why is sending files complicated?
The journey of transferring a file over WebRTC follows this path:
"file on sender’s hard drive → browser memory → WebRTC transfer → receiver’s browser memory → receiver’s disk"
Sounds simple, right?
But here’s the catch: if we try to load the whole file into browser memory at once, we hit RAM limits imposed by browsers. To overcome this, we use chunking, breaking the file into manageable pieces.
But wait, chunking alone isn’t the end of the story. WebRTC data channels weren’t initially designed to handle massive messages. Some browsers, like Firefox, allow sending huge fragments of up to 1 GB thanks to the EOR (end of record) flag, but Chrome is far stricter, limiting fragments to about 250 KB. That’s tiny! So not only do we chunk the file for reading, but we must also send those chunks in small pieces to fit browser protocol limits.
Here’s where it gets tricky: your browser often reads chunks and calls the send() method faster than WebRTC can actually transmit them. This creates a send queue—a backlog that, if ignored, can crash your connection with errors. That’s why you’ll master how to monitor the bufferedAmount and handle the “bufferedamountlow” event, ensuring smooth data flow and keeping your connection stable and alive.
This course makes these complex challenges simple and practical, turning you into a savvy WebRTC file transfer expert!
Why this Course Matters:
Secure Peer-to-Peer File Transfer: Learn how WebRTC enables direct device-to-device communication, so no servers are needed to send files.
Real Project Setup: Build a native HTTP server with Node.js and create a WebSocket signaling server, then establish a full WebRTC connection.
Deep Dive into Files: Understand what files really are at the binary level, and why browsers require files to be read into memory before sending.
Practical Use of the FileReader API: Master the most popular API for reading files into browser memory and the necessity of chunking files to avoid RAM limits.
Handling WebRTC DataChannel Limits: Discover browser-specific constraints like Chrome’s 250 KB max fragment size, and learn strategies for chunking file data and managing the send queue with the bufferedAmount property and the bufferedamountlow event.
Adding Compression: Explore optional compression techniques and the challenges of compressing already compressed image formats like PNG and JPEG.
Alternatives to FileReader: Learn how to send files as Blobs directly for different developer needs.
Modern Streams API: Utilize the Streams API for file transfer, intelligently managing chunk sizes with the "bring your own buffer" (BYOB) method for precise control.
Metadata and File Reconstruction: Implement logic to send metadata and reassemble chunks on the receiver side, creating downloadable URLs from blobs with the URL.createObjectURL method.
Interactive Learning: Benefit from quizzes, write-ups, and thorough explanations designed to make you a WebRTC file transfer grandmaster.
What You Will Build:
In this course you'll build a fully functional file transfer application that allows a sender to select a file, send it chunk-by-chunk over a WebRTC data channel, and create a download link on the receiver’s end, ready for users to save their files locally.
Course Highlights:
Understanding files as computer and browser objects.
Setting up essential signaling with WebSocket and Node HTTP servers.
Overcoming browser memory limits with chunked reading and sending.
Handling WebRTC data channel constraints and send buffer management.
Adding file compression with practical caveats and solutions.
Exploring multiple APIs including FileReader, Blob transmission, and Streams API with BYOB.
Creating reliable file reconstruction and user-friendly download links.
Experiential learning through real project code, quizzes, and explanations.
But wait, there's more!
We’ll dive into adding compression options, in case you want to squeeze those files down even smaller. Here’s the twist: image files like PNG and JPEG are already compressed by default, so trying to compress them again can blow your mind (and your file size) by making them bigger than the original! Sounds crazy, right? Don’t sweat it though, I’ve got your back, and together we’ll crack that puzzle.
Meet Your Instructor: Clyde
Clyde is a lifelong coding enthusiast who has been hooked on computers since he was 7 years old. With years of hands-on experience in web development and a passion for teaching, he brings a treasure trove of practical knowledge and real-world insights to this course. Known for his engaging and approachable style, Clyde breaks down complex concepts into clear, enjoyable lessons that anyone can follow.
You might be wondering, is he an AI? Nope, Clyde is 100% human (or is this something a robot would say ?!), and he’ll be right there with you throughout every lecture, your dedicated wingman as you tackle challenges, celebrate breakthroughs, and master the skills step by step.
This course is far from a solo adventure. With Clyde as your companion, you’ll feel supported and inspired as you journey from beginner to WebRTC file transfer pro. Let’s get crackin’
Why This Course Is Different:
It’s comprehensive, hands-on, and rooted in real-world challenges like chunking, buffer management, compression issues, and signaling. Unlike generic tutorials, you get deep understanding and practical skills that today’s employers and projects demand.
Enroll now to become an expert in peer-to-peer file transfer with WebRTC. Advance your development skills, build secure, efficient apps, and join a thriving community of forward-thinking web developers.
Let’s get crackin’