Forum Discussion

JonathanB_18379
Nimbostratus
Aug 18, 2015

HTTP Request / Response Chunking

Please be patient with me as I'm fairly new to the F5 realm. I am filling in for a senior network engineer who is out of the office.

 

I am troubleshooting an issue with a high-profile application behaving differently in test than in prod. Basic LTM pools and a standard VS; 3600s running 11.2.1.

 

They are seeing response time issues, so I'm comparing test with production. One of the few differences I’ve found is that the default http profile on test is set up with “Request Chunking = Preserve” and “Response Chunking = Selective”. In prod they are both set to Preserve.

 

Chunking sounds like fragmentation, but I’m not sure from reading the help file.

 

Could someone help explain this to me so I know if I'm running down the correct rabbit trail or not?

 

Thank you so very much for your help in advance - Jonathan

 

1 Reply

  • Hamish
    Cirrocumulus

    Chunking is a concept introduced in HTTP/1.1 in order to reduce the amount of local storage required on the web server for sending responses.

     

    In order to perform HTTP keepalives, the client has to know when the response has finished. With earlier versions of HTTP the response finished when the connection closed, so the client knew immediately that the content was complete. With connection keep-alive the client has to know exactly when the response has finished, so the server does the obvious thing and specifies the size of the response in the header (Content-Length). But in order to know the size, the server has to accumulate the WHOLE response so it can count the bytes before sending the header.
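
    To make that concrete, here's a rough Python sketch (the payload is made up) of what Content-Length framing forces the server to do before it can send anything:

        body_parts = [b"<html>", b"...lots of generated content...", b"</html>"]

        # Must accumulate everything first, just to count the bytes...
        body = b"".join(body_parts)

        response = (
            b"HTTP/1.1 200 OK\r\n"
            b"Content-Type: text/html\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"Connection: keep-alive\r\n"
            b"\r\n"
        ) + body
        # ...only now can the server start writing 'response' to the socket.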

     

    So... That accumulation takes time (time which could be spent streaming the response), and of course space. So the solution was to split the response into chunks. The server accumulates a small amount of the response (e.g. up to 1024 bytes; the size is arbitrary) and sends a header telling the client the response is chunked. Then before each chunk the server tells the client "I'm about to send X bytes", and then sends X bytes. The last chunk is always 0 length, i.e. no more data... The client then just has to accumulate the chunks and knows when the response is finished, so it can re-use the connection to send another request.
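
    On the wire it looks roughly like this. A quick Python sketch of the chunked format (chunk sizes are in hex, and the zero-length chunk at the end is the terminator):

        def chunked(parts):
            """Yield the on-the-wire bytes for an iterable of body pieces."""
            for data in parts:
                if data:
                    yield b"%x\r\n" % len(data) + data + b"\r\n"
            yield b"0\r\n\r\n"   # last chunk: zero length, i.e. no more data

        headers = (
            b"HTTP/1.1 200 OK\r\n"
            b"Transfer-Encoding: chunked\r\n"   # no Content-Length needed
            b"Connection: keep-alive\r\n"
            b"\r\n"
        )
        # The server can start sending as soon as the first piece is ready:
        wire = headers + b"".join(chunked([b"<html>", b"...more content...", b"</html>"]))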

     

    So... It's very much like fragmentation, in that the response is fragmented into chunks of data.

     

    Now... The difference between the settings... If any data needs to be re-written by the LTM then you'll need to set re-chunking. I usually use always (selective has usually meant problems most of the times I've needed it). Never means no re-chunking will be done. Which one is quicker? It depends on the environment: the speed of the circuits will affect it, you may have different MTUs on the client and server sides, etc...
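
    If you want to see what each environment is actually handing back to clients, something like this quick Python check (the hostnames are just placeholders for your test and prod VIPs) will show whether responses arrive chunked or with a Content-Length:

        import http.client

        # Placeholder hostnames - point these at your test and prod VIPs.
        for host in ("app-test.example.com", "app-prod.example.com"):
            conn = http.client.HTTPConnection(host, timeout=5)
            conn.request("GET", "/")
            resp = conn.getresponse()
            print(host,
                  "| Transfer-Encoding:", resp.getheader("Transfer-Encoding"),
                  "| Content-Length:", resp.getheader("Content-Length"))
            resp.read()   # drain the body so the connection could be reused
            conn.close()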

     

    There's a good article by Deb Allen online here => https://devcentral.f5.com/articles/ltm-http-profile-option-response-chunking which probably explains it a bit better than I have :) A lot more detail and time taken over it for a start.

     

    H