Large file uploads with a OneConnect profile
I have a fairly vanilla HTTP virtual server (running 11.6.1) sitting in front of a WebSphere application (4 nodes, load balanced).
The config is as follows:
- TCP: default tcp
- HTTP profile: default http
- Source Address: Auto Map
- Rewrite profile: myRewrite
- Access Profile: myAccessProfile
- OneConnect Profile: default oneConnect
- Session persistence: Cookie
I'm having an issue with Internet Explorer 11 when a user uploads a large file (1GB or larger) via the interface. After every 900-950MB the upload stops and the file is shown as partially complete. The user can resume the upload and it does continue, but I need to stop the upload from pausing partway through.
The upload will successfully complete in one attempt if the user does it with Chrome.
The web application uses a jQuery tool to do the file upload. The file is split into 10MB chunks, which are sent in multiple HTTP requests.
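To put the stall point in terms of requests rather than bytes, here is a back-of-the-envelope calculation in plain Tcl (runnable in tclsh, not an iRule). It assumes every request carries exactly one 10MB chunk and ignores any other requests on the same connection; the variable names are only for illustration.

```tcl
# Rough arithmetic only: how many chunk requests fit before the stall point?
set chunk_mb 10
foreach stall_mb {900 950} {
    # Integer division is fine here because both values are multiples of 10
    set requests [expr {$stall_mb / $chunk_mb}]
    puts "${stall_mb}MB / ${chunk_mb}MB per chunk = about $requests requests"
}
```

So the pause shows up after roughly 90 to 95 chunk requests, if that assumption holds.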
I added some log outputs to various events as a first step, and it was the LB events that I found most interesting.
when LB_SELECTED {
    log local0. "member selected: [LB::server]"
}
when SERVER_CONNECTED {
    log local0. "from [IP::client_addr]:[TCP::client_port] to vip [IP::local_addr]:[TCP::local_port]"
}
when LB_FAILED {
    log local0. "whoops LB failed - [event info]"
}
With the OneConnect profile in place, I see LB_SELECTED fire for each of the 10MB segments, as you would expect. Around the 900MB mark a new TCP connection appears to be established (this takes about 70 seconds): I can see the SERVER_CONNECTED event firing and the client port has changed. This occurs with both IE and Chrome. However, with IE I also see an LB_FAILED event at the same time, but the [event info] is blank. It's at this point that the upload pauses in IE.
I fired up tcpdump and captured the entire upload process. Right before the new TCP connection is established I can see a RST,ACK sent from the F5 VIP address to the client. I searched the entire capture and it is the only RST, so it doesn't look to have come from the node.
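As a possible next step, the logging could be extended to show how many requests each client connection has carried and which side closes first. The following is an untested sketch, not something I have run on this virtual server: the req_count variable is a name made up for illustration, and it assumes the standard CLIENT_CLOSED, SERVER_CLOSED and HTTP_RESPONSE events behave normally alongside the access and OneConnect profiles.

```tcl
when CLIENT_ACCEPTED {
    # Per-client-connection request counter (illustrative variable name)
    set req_count 0
}
when HTTP_REQUEST {
    incr req_count
    log local0. "request #$req_count from [IP::client_addr]:[TCP::client_port] [HTTP::method]"
}
when HTTP_RESPONSE {
    # A "Connection: close" from the node would be worth noticing here
    log local0. "response [HTTP::status] Connection header: [HTTP::header Connection]"
}
when CLIENT_CLOSED {
    log local0. "client side closed after $req_count requests: [IP::client_addr]:[TCP::client_port]"
}
when SERVER_CLOSED {
    # Server-side context, so remote_addr is the pool member
    log local0. "server side closed: [IP::remote_addr]:[TCP::remote_port]"
}
```

If the request count at the point of the reset is the same on every attempt, that would suggest a per-connection limit somewhere rather than a timeout.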
I checked the logs to see if we had encountered port exhaustion (https://support.f5.com/csp/article/K7820), but there was nothing in the logs. There are also no logs stating that the node/pool has been marked as down.
Does anyone have any thoughts as to why a new connection would be established? There is clearly traffic being sent down the connection, so idle timeout shouldn't be an issue.
Thanks for getting this far 🙂
Cheers, Simon