Designing network protocols

Revision as of 00:28, 1 December 2005

TCP

The main trap most people fall into while "designing" a network protocol over TCP is to assume that every "chunk" sent will be received by the program at the other end as exactly the same chunk.

This is most definitely NOT the case. TCP delivers a stream of bytes, not a sequence of messages, and several factors change how that stream is cut up:

  1. Packet sizes at the sender's end. Chunks of data will be split into packets of a size that the network can handle.
  2. Buffering at the receiver's end. Incoming packets are buffered until the application is ready to read them. When it does read, it may get several complete packets, part of a packet, or both at once, but the data will ALWAYS be in the same order it was sent.

You need to take these into account and design your protocol around explicit delimiters. Common delimiters are new lines (CR LF), spaces and nulls. Some protocols instead use a fixed chunk size or send a header specifying the size before the main data.
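
The size-header approach can be sketched as follows. This is a minimal illustration in Python rather than Visual Basic, so that it can be run without the imaginary Socket class; the 4-byte big-endian length format and the names frame and unframe are assumptions made for this example, not part of any particular protocol.

```python
import struct

# Assumed header format: 4-byte unsigned length in network byte order.
HEADER = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length so the receiver knows where it ends."""
    return HEADER.pack(len(payload)) + payload


def unframe(buffer: bytes):
    """Split a receive buffer into complete payloads plus any leftover bytes."""
    messages = []
    while len(buffer) >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer)
        end = HEADER.size + length
        if len(buffer) < end:
            break  # payload not fully received yet; wait for more data
        messages.append(buffer[HEADER.size:end])
        buffer = buffer[end:]
    return messages, buffer
```

The leftover bytes returned by unframe are kept and passed in again, with the next data appended, the next time something arrives.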

Sending data

To make "nice" code, you would normally have a function that you pass a command to, which does any formatting necessary to pass it over the network connection. For this sample, I will imaginatively call it Send().

These samples are written using an imaginary Socket class and are not based on any particular library.

This code uses a simple new line (CR LF) as the delimiter:

Sub Send(ByVal Command As String)
  Socket.SendData Command & vbCrLf
End Sub

This code is lazy as it assumes the command will be sent without a problem. Of course, when you write it for real, you will add error handling, or check how much was actually sent, and report this back to the main application.
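
A less lazy version might look like the sketch below. It is written in Python against the standard socket module rather than the imaginary Socket class, and the function name send_command and the ASCII encoding are assumptions for illustration only.

```python
import socket


def send_command(sock: socket.socket, command: str) -> None:
    """Send one CRLF-terminated command, retrying until every byte is accepted."""
    data = command.encode("ascii") + b"\r\n"
    sent_total = 0
    while sent_total < len(data):
        # send() may accept fewer bytes than offered; it returns the count taken.
        sent = sock.send(data[sent_total:])
        if sent == 0:
            raise ConnectionError("connection closed before the command was sent")
        sent_total += sent
```

Python's own sock.sendall() performs the same loop internally; it is spelled out here only to show what "check how much was actually sent" means in practice.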

Receiving data

When receiving data, you cannot know how it will be split up when it arrives. To get around this, you need to receive all data into a static buffer and then perform your parsing on that.

When parsing your data, look for the first delimiter in the buffer, read any data before it and process it. You then remove this "handled" data, along with its delimiter, from the buffer and repeat the process. Keep doing this until there is no delimiter left in the buffer. At this point, the buffer is either empty or holds "part" of a command that is not yet complete, so you leave it for next time.

When you next receive some data, you append it to the buffer containing the partial command from the previous data and carry on as usual. Once the rest of the command has arrived, you will find it complete and ready to be parsed.

Sub Socket_NewData(ByVal Data As String)
Static Buffer As String 'This contains any data waiting to be processed
Dim Command As String 'The current command as it is parsed
Dim Pos As Long 'Position of the next delimiter in the buffer

  'Append the new data to the buffer
  Buffer = Buffer & Data
  
  'Loop until no more delimiters
  Pos = InStr(Buffer, vbCrLf)
  Do While Pos > 0
    'Parse out the command before the new line (VB strings start at 1, not 0)
    Command = Left$(Buffer, Pos - 1)
    'Remove the command and its two-character delimiter from the buffer
    Buffer = Mid$(Buffer, Pos + Len(vbCrLf))
    
    'Do what we need to with the command
    ProcessCommand Command
    
    'Look for the next delimiter
    Pos = InStr(Buffer, vbCrLf)
  Loop
End Sub

The important parts of this code are the Static variable and the loop. The static variable keeps any data from one call to the next so you don't lose anything, and the loop processes all the complete commands it can so you don't fall behind.
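
To see the buffering behave when commands arrive split across several reads, here is the same idea as a small runnable sketch in Python. The class name LineReceiver and its feed method are hypothetical names for this example; the buffer attribute plays the role of the Static variable.

```python
class LineReceiver:
    """Collects incoming data and hands out complete CRLF-terminated commands."""

    def __init__(self):
        self.buffer = b""  # survives between calls, like the Static Buffer

    def feed(self, data: bytes):
        """Append new data and return every complete command found so far."""
        self.buffer += data
        commands = []
        while b"\r\n" in self.buffer:
            # Split off one command; the delimiter itself is discarded.
            command, self.buffer = self.buffer.split(b"\r\n", 1)
            commands.append(command)
        return commands
```

Feeding it b"HEL" returns nothing, and then feeding b"LO\r\nWORLD\r\nPA" returns the two complete commands while b"PA" stays in the buffer for next time. A delimiter that is itself split across two reads (CR in one, LF in the next) is handled the same way.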

UDP

UDP is another protocol that runs over IP. It gets around this whole buffering problem because each datagram is received as a whole, exactly as it was sent. By design, however, the data is NOT guaranteed to arrive in the order it was sent, or even to arrive at all. UDP is also connectionless, which allows one receiver to accept data from multiple senders.
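
The sketch below shows this on the local machine, again in Python with the standard socket module. The loopback address, the 2-second timeout and the message contents are arbitrary choices for illustration. Each recvfrom call returns one whole datagram or nothing, and the code has to tolerate datagrams that never turn up.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
receiver.settimeout(2.0)          # a datagram may never arrive, so do not wait forever
address = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # no connect step needed
sender.sendto(b"first command", address)
sender.sendto(b"second command", address)

received = []
try:
    for _ in range(2):
        data, _peer = receiver.recvfrom(2048)  # one call returns one whole datagram
        received.append(data)
except socket.timeout:
    pass  # loss is allowed by design; the application must cope with it
finally:
    sender.close()
    receiver.close()
```

No delimiter or receive buffer is needed here, but nothing guarantees that both datagrams arrive or that they arrive in the order sent.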

UDP is normally used for protocols such as streaming video/audio, where TCP's extra error checking and connection handling are too much of an overhead and that level of reliability is not needed.