Long story short: in Linux, how do I ensure that an ACK message is received for a certain TCP packet?
Full story:
I'm debugging an Asterisk/OpenH323 <-> Panasonic IP-GW16 issue.
An H323 connection involves two sessions: H225.0 and H245. These are just two TCP sessions, over which some data are transmitted.
Let's call them Session 1 (for H225.0) and Session 2 (for H245). Session 1 uses the well-known TCP port 1720, while the port for Session 2 is chosen at runtime.
Control flow goes as follows:
1. Panasonic calls Asterisk: it opens Session 1 (TCP/1720) to Asterisk and sends a SETUP message over Session 1, which contains the port 2 that Panasonic will listen on.
2. Asterisk sends Panasonic a CALL PROCEEDING message over Session 1.
3. Panasonic starts listening on port 2.
4. Panasonic sends a TCP ACK over Session 1.
5. Asterisk opens TCP Session 2 on port 2.
The order of steps 2 and 3 is important: Panasonic will not listen on port 2 until it has received the CALL PROCEEDING message from step 2.
But in the OpenH323 code, step 2 and step 5 are only a few lines apart.
That's why the connection sometimes works in debug mode and almost never works in release.
It is clearly visible in a packet dump. I ran a series of experiments, and in 52 cases out of 52, if step 5 happens before step 4, the connection fails; otherwise, the connection succeeds.
Panasonic sends no other messages except that ACK in step 4, and it seems that the only way Asterisk can know that port 2 is being listened on is by receiving that ACK.
Of course I may implement a timed wait, but I want a cleaner solution.
So the question again: after sending a message over a TCP connection in step 2, how do I know whether an ACK has been received for the packet containing the message?
-
In this specific case, I'd say you would find that your tcp_info structure would hold a nonzero tcp_info.tcpi_unacked. You'd get this via getsockopt(TCP_INFO).
Note: unstable interface apparently.
-
The timing sequence seems odd, although if the Panasonic is using a proprietary O/S that might explain it.
To clarify - AIUI - if the Panasonic was running a "normal" O/S, the ACK sent by it in step 4 would happen immediately after the Panasonic's software has read() the data from the control TCP socket.
Similarly the OpenH323 code call to write() (in step 2) shouldn't return (assuming that it's not a non-blocking socket!) until the ACK from the Panasonic has been received by the Asterisk server. That is how you're supposed to know that the ACK has been received.
Essentially it seems that the Panasonic isn't doing the equivalent of listen() on the second socket until after it has read() the CALL PROCEEDING message. It looks like a race condition - sometimes OpenH323 will try to connect() before the other end is ready.
When this happens, do you get ECONNREFUSED at the OpenH323 end?
Alnitak: ok - this was a bit wrong - I'll edit. The ACK would normally be sent once the _O/S_ has received the packet, it doesn't rely on the application read()ing the data.
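To check the ECONNREFUSED question in practice, here is a rough sketch of how the refusal would surface from connect() and how a short bounded retry could paper over the race. The function name connect_with_retry and its attempts/delay_ms parameters are invented for this example, and this is closer to the timed wait the question wants to avoid than to a real fix.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: try to connect to addr up to `attempts` times,
 * sleeping delay_ms between tries, but only when the peer answered with
 * ECONNREFUSED (i.e. nobody is listening on the port yet).
 * Returns a connected socket, or -1 with errno set. */
int connect_with_retry(const struct sockaddr_in *addr, int attempts, int delay_ms)
{
    for (int i = 0; i < attempts; i++) {
        /* Use a fresh socket per attempt: the state of a socket after a
         * failed connect() is unspecified. */
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;

        if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) == 0)
            return fd;

        int saved = errno;
        close(fd);
        if (saved != ECONNREFUSED) {
            errno = saved;        /* some other failure: give up at once */
            return -1;
        }
        poll(NULL, 0, delay_ms);  /* peer not listening yet: wait and retry */
    }
    errno = ECONNREFUSED;
    return -1;
}
```

Logging errno on the first failed attempt would answer the question directly: ECONNREFUSED means the Panasonic had not started listening on port 2 when the SYN arrived.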