Monday, March 4, 2013

Understanding TCP SLOW START

In my opinion, TCP Slow Start is a fundamental concept that every network engineer should master. It may be a good idea to read through my previous post about Window Size and Scaling before continuing. To understand TCP Slow Start, there are a few terms that we must be familiar with.

TCP Slow Start - Congestion control mechanism which controls the growth of the sending rate.  
IW - Initial Window 
CWND - Congestion Window (The send window of the sender)
RWND - Receive Window (The window size of the receiver) 
SMSS - Sender Maximum Segment Size (the largest segment, in bytes, that the sender can transmit, not counting the TCP/IP headers)
ISST - Initial Slow Start Threshold
SST - Slow Start Threshold
FlightSize - Amount of data that has been sent but not yet acknowledged. The sender may have at most min(cwnd, rwnd) bytes in flight at any time. Early in a connection the CWND is usually the smaller of the two.
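To make the min(cwnd, rwnd) rule concrete, here is a minimal sketch in Python. The function name `sendable_bytes` and the sample numbers are my own, chosen for illustration; they do not come from any real TCP stack.

```python
def sendable_bytes(cwnd: int, rwnd: int, flight_size: int) -> int:
    """Bytes the sender may still put on the wire right now.

    cwnd        -- congestion window in bytes (sender-side limit)
    rwnd        -- receive window in bytes (advertised by the receiver)
    flight_size -- bytes already sent but not yet acknowledged
    """
    window = min(cwnd, rwnd)  # the smaller window is the binding limit
    return max(0, window - flight_size)


# Hypothetical numbers: 3 segments of 1460 bytes allowed, 1 already in flight.
print(sendable_bytes(cwnd=4380, rwnd=65535, flight_size=1460))
```

With these made-up values the congestion window is the limit, so the sender may still send two more 1460-byte segments.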

TCP Slow Start is a congestion control mechanism that was developed because early TCP implementations allowed the sender to put a full RWND of data on the wire at once, and this led to congestion bad enough to render TCP useless. Imagine a sender on a 10 Mbps link bursting data toward a router in the middle that can only forward at 56 kbps: the router's queue overflows and it drops the packets. As we all know, dropped packets in the network adversely affect your throughput because TCP cuts back the sending rate every time it notices packet loss, for example through a timeout (RTO = Retransmission Timeout).

In this blogtorial, we will walk through the TCP Slow Start process and try to understand how it works and how it benefits us. Most of the details will be from RFC 5681 and a few other RFCs, however keep in mind that different TCP implementations will have slight variations to the Slow Start process.

The overall concept behind the TCP Slow Start process is very simple -- probe the path to the destination to find a send rate at which packet loss can be kept to a minimum. Take a look at the diagram below and let's visualize the concept.

As you can see, too many sentences are being thrown at the receiver and some sentences are lost. This means there is confusion, and the receiver would have to say "Can you repeat the last sentence?". The same is true for TCP connections: if the sender stuffs the wire with too many packets at once and they get lost in transit, then the packets have to be retransmitted, the receiver has to reorder them, the application experiences delays, and the throughput is negatively affected -- pretty much everything goes wrong when you have packet loss. So how does TCP Slow Start help with this issue?

The basic algorithm of TCP Slow Start is as follows.

Set the IW to a relatively small value, set the ISST (Initial Slow Start Threshold) to the RWND, and exponentially increase the send window up to the ISST. Then go into congestion avoidance mode, where the send window is increased linearly every transmission round until there is packet loss.

According to RFC 5681, the initial window must be no larger than the following. Side note -- the amount of data actually sent in the first round is min(IW, RWND).

   If SMSS > 2190 bytes:
       IW = 2 * SMSS bytes and MUST NOT be more than 2 segments
   If (SMSS > 1095 bytes) and (SMSS <= 2190 bytes):
       IW = 3 * SMSS bytes and MUST NOT be more than 3 segments
   If SMSS <= 1095 bytes:
       IW = 4 * SMSS bytes and MUST NOT be more than 4 segments
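The three cases above translate almost directly into code. This is only a sketch of the upper bound as quoted from RFC 5681; the function name `initial_window` is my own, and real TCP stacks may choose a different initial window.

```python
def initial_window(smss: int) -> int:
    """Upper bound on the initial window in bytes, per the RFC 5681 rules above."""
    if smss > 2190:
        return 2 * smss  # at most 2 segments
    if smss > 1095:
        return 3 * smss  # at most 3 segments
    return 4 * smss      # at most 4 segments


# A common Ethernet-sized SMSS of 1460 bytes falls into the middle case.
print(initial_window(1460))  # 4380 bytes, i.e. 3 segments
```

An SMSS of 1460 bytes is an assumption picked because it is typical on Ethernet paths; it gives an initial window of 3 segments.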

So if the receiver advertises that 100,000 bytes can be put on the wire without acknowledgement, the sender will not send 100,000 bytes right from the start, but will instead ramp up to that receive window one transmission round at a time.

Once the Initial Window is calculated and set, it is time to set the Initial Slow Start Threshold. This is usually set to the window advertised by the receiver or to something large, depending on the TCP implementation. Until the slow start threshold is reached, the sender increases its cwnd by one SMSS for every ACK received. In mathematical terms, if cwnd < slow start threshold then cwnd += (# of ACKs received) * SMSS, essentially doubling the window every transmission round until the slow start threshold is reached. Note that a follow-up article will discuss how TCP acts when cwnd > slow start threshold, so we will be discussing congestion avoidance mode, fast retransmit and fast recovery (just one of many loss recovery algorithms).
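Here is a rough sketch of that per-ACK rule for a single transmission round. It assumes the receiver sends exactly one ACK per segment and that nothing is lost; the function name `slow_start_round` and the sample values are hypothetical.

```python
def slow_start_round(cwnd: int, smss: int, ssthresh: int) -> int:
    """Return cwnd (bytes) after one transmission round of slow start.

    Assumes one ACK comes back for every full segment sent and no loss.
    """
    segments_sent = cwnd // smss
    for _ in range(segments_sent):  # one ACK per segment
        if cwnd < ssthresh:         # slow start only applies below the threshold
            cwnd += smss            # grow by one SMSS per ACK
    return cwnd


cwnd = 4380  # 3 segments of 1460 bytes
for round_number in range(1, 4):
    cwnd = slow_start_round(cwnd, smss=1460, ssthresh=65535)
    print(round_number, cwnd)
```

Because every ACK adds one SMSS and every segment produces one ACK, the window doubles each round: 8760, 17520 and 35040 bytes in this run.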

Let's visualize what happens in slow start and compare it with a Wireshark capture.

Once the TCP connection is established after the 3-way handshake (perhaps a separate blogtorial), the sender sends 4 packets (the sender's initial congestion window, aka cwnd), then waits for and receives 4 ACKs. The sender increases the cwnd by the # of ACKs received, so cwnd grows by 4 and the sender can now send 8 packets, and so on. This continues until the sender reaches the slow start threshold. Once this threshold is reached, the sender goes into congestion avoidance mode, in which cwnd is increased by 1 SMSS per transmission round no matter how many ACKs are received. In mathematical terms, cwnd += 1 SMSS per transmission round in congestion avoidance mode.
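The walk-through above can be sketched as a per-round trace, counted in segments rather than bytes. This is a simplified model, not how any real stack is coded: it assumes no loss, one ACK per segment, and that the doubling is capped at the threshold. The name `window_trace` and the threshold of 16 segments are made up for the example.

```python
def window_trace(iw_segments: int, ssthresh_segments: int, rounds: int) -> list[int]:
    """cwnd (in segments) at the start of each transmission round."""
    cwnd = iw_segments
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd < ssthresh_segments:
            # Slow start: the window doubles each round (capped at the threshold).
            cwnd = min(cwnd * 2, ssthresh_segments)
        else:
            # Congestion avoidance: one extra segment per round.
            cwnd += 1
    return trace


print(window_trace(iw_segments=4, ssthresh_segments=16, rounds=6))
# [4, 8, 16, 17, 18, 19]
```

The first three rounds show the exponential phase (4, 8, 16), and from there on the window creeps up by one segment per round.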

So let's take a look at a Wireshark capture taken as I was downloading WinRAR from the internet. If you would like to download my Wireshark capture to follow along please click here.

Click on packet #2 and go to "Statistics" --> TCP StreamGraph --> Time-Sequence Graph (Stevens).

Things to note about this graph.
  • Notice the clustering of packets about every 50ms or so. This is because the round trip time between my computer and the server is about 50ms. Every 50ms the server sends the # of packets that the CWND (congestion window) will allow and waits for my acknowledgment.
  • This is not a true exponential curve doubling itself every transmission round and we'll discuss that later in the blogtorial.
Zoom into the graph by clicking the middle mouse button and zoom into the 1st cluster. Notice that there are 3 packets, which tells me that the server's initial window is 3 packets (so its FlightSize, the amount of unacknowledged data on the wire, is 3 packets in the first round). This is in line with RFC 5681 which states

   If (SMSS > 1095 bytes) and (SMSS <= 2190 bytes):
    IW = 3 * SMSS bytes and MUST NOT be more than 3 segments

As you can see, we are sending ACKs back, and when you zoom into the next cluster you will notice that we received more packets. If you zoom out and zoom back into the following cluster you will notice even more packets, and every transmission round we seem to get more packets at once. This goes on for a while until my computer starts sending 1 ACK per 2 segments received from the server, so the TCP slow start is not a "true" exponential growth. The reason my computer starts sending 1 ACK per 2 segments received from the server is known as "Delayed ACK", which is a topic all in itself and we'll save it for another blogtorial.
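To see why delayed ACKs flatten the curve, here is a small model that compares one ACK per segment with one ACK per two segments. It assumes no loss, that each ACK grows cwnd by one segment, and that a leftover odd segment still gets its own ACK eventually. The function name `slow_start_trace` is my own, and the numbers are not taken from my capture.

```python
import math


def slow_start_trace(iw_segments: int, rounds: int, segments_per_ack: int = 1) -> list[int]:
    """cwnd (in segments) at the start of each round during slow start."""
    cwnd = iw_segments
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        # Number of ACKs the receiver returns for this round's segments.
        acks = math.ceil(cwnd / segments_per_ack)
        cwnd += acks  # each ACK grows the window by one segment
    return trace


print(slow_start_trace(3, 5, segments_per_ack=1))  # [3, 6, 12, 24, 48]
print(slow_start_trace(3, 5, segments_per_ack=2))  # [3, 5, 8, 12, 18]
```

In this model, one ACK per segment doubles the window every round, while one ACK per two segments grows it by roughly 1.5x per round. That is still exponential, just with a smaller base.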

So there you have it -- TCP Slow Start in a nutshell. Basically, set the initial window relatively low, then exponentially increase the send window based on the # of ACKs received. I tried to keep this post as simple as possible so that anyone can read it and get a good understanding of TCP Slow Start -- hopefully it was easy to follow. It also got me thinking -- TCP Slow Start grows at an exponential rate, which is not really slow, so why is it called the "Slow" start? It is slow only in comparison to what came before: when TCP was first implemented, the sender was able to send the entire RWND in the very first transmission round (say from a 10 Mbps link), but the routers interconnecting the sites couldn't handle 10 Mbps, so they would start dropping packets, rendering TCP useless until congestion control was implemented.

I will close this blogtorial with a statement by Mr. Van Jacobson, who many consider to be the father of TCP/IP congestion control: "how can we ever troubleshoot the issue if we don't even understand how TCP/IP works to begin with".

Many more articles to come so stay tuned.

Please subscribe/comment/+1 if you like my posts as it keeps me motivated to write more and spread the knowledge.


  1. great tutorial. Thanks for sharing and writing this.
    Awaiting for congestion avoidance mode, fast re-transmit and fast recovery , as well as delayed ack tutorials.

  2. This is really a wonderful post! Thanks for the summary!!!