Hi,
We are using pjsip 2.9 with Asterisk 13.32.0, over the WebRTC transport with 'rel100' enabled. About 170 SIP endpoints are connected, with about 150 simultaneous calls.
We had a crash last week and have not been able to reproduce it since. The stack of the thread that segfaulted is:
#0 pjsip_tx_data_add_ref (tdata=0x0) at ../src/pjsip/sip_transport.c:512
#1 0x00007f8bfd51ee12 in on_retransmit (timer_heap=<optimized out>, entry=0x7f8b9437d748) at ../src/pjsip-ua/sip_100rel.c:599
#2 0x00007f8bfd5d2fa7 in pj_timer_heap_poll (ht=0x36f0850, next_delay=next_delay@entry=0x7f8beb697ce0) at ../src/pj/timer.c:659
#3 0x00007f8bfd536dad in pjsip_endpt_handle_events2 (endpt=0x36f0568, max_timeout=max_timeout@entry=0x7f8beb697d40, p_count=p_count@entry=0x0) at ../src/pjsip/sip_endpoint.c:716
#4 0x00007f8bfd536ec7 in pjsip_endpt_handle_events (endpt=<optimized out>, max_timeout=max_timeout@entry=0x7f8beb697d40) at ../src/pjsip/sip_endpoint.c:777
#5 0x00007f8b877a6f30 in monitor_thread_exec (endpt=<optimized out>) at res_pjsip.c:4465
#6 0x00007f8bfd5bc000 in thread_main (param=0x379a3a8) at ../src/pj/os_core_unix.c:541
#7 0x00007f8bfb609e65 in start_thread () from /usr/lib64/libpthread.so.0
#8 0x00007f8bfa9ab88d in clone () from /usr/lib64/libc.so.6
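It seems to be related to timers and/or rel100/PRACK: the timer heap called on_retransmit in sip_100rel.c, which then passed a NULL tdata to pjsip_tx_data_add_ref.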
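To make the suspected pattern concrete, here is a minimal sketch in C of a retransmit timer callback that fires after its pending transmit data has already been cleared, together with a defensive guard. All types and function names below are hypothetical stand-ins, not pjsip's actual code, and this is not the change made in any pjsip fix; it only illustrates the kind of NULL dereference that frames #0 and #1 suggest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT pjsip types. */
typedef struct tx_data {
    int ref_count;
} tx_data;

typedef struct retransmit_state {
    tx_data *pending;    /* response still awaiting PRACK, or NULL once cleared */
    int      sent_count; /* number of retransmissions performed */
} retransmit_state;

/* Like frame #0 in the trace, this dereferences its argument, so it must
 * never be called with NULL. The assert documents that precondition. */
static void tx_data_add_ref(tx_data *tdata)
{
    assert(tdata != NULL);
    tdata->ref_count++;
}

/* Timer callback sketch. If the pending response was cleared before the
 * timer fired, skip the retransmission instead of dereferencing NULL.
 * Returns 1 if a retransmission was performed, 0 if it was skipped. */
static int on_retransmit_guarded(retransmit_state *state)
{
    if (state == NULL || state->pending == NULL) {
        return 0;
    }
    tx_data_add_ref(state->pending);
    state->sent_count++;
    return 1;
}
```

A guard like this only hides the symptom. If the real cause is a race between the timer and the code that clears the pending response, the timer also has to be cancelled or synchronized properly.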
We have attached additional information.
We have confirmed that the related timer fixes (#2230 and #2172) are applied.
Additionally, we think the issue could be related to network latency or other network problems, because some endpoints are connected over Wi-Fi.
Does anyone know whether this is a known issue?
Can anyone help us?
Regards,
Hi,
It seems that #2350 could fix the crash (https://github.com/pjsip/pjproject/pull/2350).
#2350 has been applied on master (2.11). Does anyone know whether it is safe to apply fix #2350 to pjsip 2.9?
Regards.
From: Josep Bort
Sent: Thursday, 2 April 2020 12:44
To: 'pjsip@lists.pjsip.org' pjsip@lists.pjsip.org
Subject: Pjsip 2.9 crash in pjsip_tx_data_add_ref