"Connection reset by peer" while flying streaming video

I have a ground device (RPi 4) connected to the drone's Wi-Fi. The drone streams video, and I control the flight from the Pi.

The Pi and the drone are at most 3 m apart, with nothing but air between them. However, the connection seems to be very unstable: the video shown on the Pi is full of artefacts, with medium latency and bad quality (using the YUV callback on the Pi).

And then, all of a sudden:

2022-08-11 11:14:49,500 [ERROR]         ulog - pomp - read(fd=26) err=104(Connection reset by peer)
2022-08-11 11:14:50,378 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:14:51,379 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:14:51,503 [ERROR]         olympe.pdraw.ANAFI-G134295 - destroy - Pdraw.destroy() timedout
2022-08-11 11:14:52,379 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:14:53,382 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:14:54,380 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:14:54,505 [ERROR]         olympe.pdraw.ANAFI-G134295 - _video_sink_flush_impl - video sink flush id 3 timeout
2022-08-11 11:14:54,513 [ERROR]         ulog - rtsp_client - rtsp_client_pomp_timer_cb:274: send_keep_alive err=16(Device or resource busy)
2022-08-11 11:14:54,515 [ERROR]         ulog - pomp - idle entry cb=0x7fb29e2fc0 userdata=0x7f78002760 still in the loop
2022-08-11 11:14:54,515 [ERROR]         ulog - pomp - idle entry cb=0x7fb29e2fc0 userdata=0x7f78002760 still in the loop
2022-08-11 11:14:54,516 [ERROR]         ulog - pomp - idle entry cb=0x7fb29fe950 userdata=0x7f78001380 still in the loop
2022-08-11 11:14:54,516 [ERROR]         ulog - pomp - idle entry cb=0x7fb29e2fc0 userdata=0x7f78002760 still in the loop
2022-08-11 11:14:54,516 [ERROR]         ulog - pomp - idle entry cb=0x7fb29df330 userdata=0x7f78002760 still in the loop
2022-08-11 11:14:54,516 [ERROR]         ulog - pomp - fd=30, cb=0x7fb2dc8070 still in loop
2022-08-11 11:14:54,516 [ERROR]         ulog - pomp - fd=46, cb=0x7fb2dc8070 still in loop
2022-08-11 11:14:54,517 [ERROR]         ulog - pomp - fd=50, cb=0x7fb29de190 still in loop
2022-08-11 11:14:54,517 [ERROR]         ulog - pomp - fd=48, cb=0x7fb2dc8070 still in loop
2022-08-11 11:14:54,518 [ERROR]         ulog - pomp - fd=54, cb=0x7fb26687c0 still in loop
2022-08-11 11:14:54,518 [ERROR]         ulog - pomp - fd=52, cb=0x7fb2dc8070 still in loop
2022-08-11 11:14:54,518 [ERROR]         ulog - pomp - fd=56, cb=0x7fb2dc4660 still in loop
2022-08-11 11:14:54,518 [ERROR]         ulog - pomp - fd=47, cb=0x7fb2dbcf70 still in loop
2022-08-11 11:14:54,518 [ERROR]         ulog - pomp - fd=45, cb=0x7fb2dc8070 still in loop
2022-08-11 11:14:54,518 [ERROR]         ulog - pomp - fd=49, cb=0x7fb29ddf90 still in loop
2022-08-11 11:14:54,518 [ERROR]         olympe.pdraw.ANAFI-G134295 - _destroy_pomp_loop - Error while destroying pomp loop: -16
2022-08-11 11:14:55,381 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:14:56,383 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:14:57,382 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??

Q: Is this “Connection reset by peer” what I think it is, i.e. a loss of connectivity between the drone and the Pi?

The bad thing: the local video stream problems on the Pi block everything. Before that situation the Pi showed lat, lng and alt perfectly; afterwards these functions return nothing anymore. I do not even get a chance to fire a “land” command. I need to restart my script in order to land.

Please elaborate.

EDIT: Also observed:

2022-08-11 11:12:22,433 [INFO] pltest.py: state: Hovering, lat: MYLAT, lng: MYLNG, alt: MYALT
2022-08-11 11:12:23,434 [INFO] pltest.py: state: Hovering, lat: MYLAT, lng: MYLNG, alt: MYALT
2022-08-11 11:12:24,437 [INFO] pltest.py: state: Hovering, lat: MYLAT, lng: MYLNG, alt: MYALT
2022-08-11 11:12:24,983 [ERROR]         ulog - pomp - read(fd=26) err=104(Connection reset by peer)
2022-08-11 11:12:24,987 [ERROR]         ulog - pdraw_dmxstrm - stop:1168: StreamDemuxerNet#1: rtsp_client_teardown err=16(Device or resource busy)
2022-08-11 11:12:24,989 [ERROR]         ulog - rtsp_client - request_complete: session not found
2022-08-11 11:12:24,990 [ERROR]         ulog - rtsp_client - rtsp_client_pomp_event_cb:736: request_complete err=2(No such file or directory)
2022-08-11 11:12:24,994 [ERROR]         ulog - pdraw_dmxstrm - idleRtspDisconnect:1034: StreamDemuxerNet#1: rtsp_client_disconnect err=71(Protocol error)
2022-08-11 11:12:25,434 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:12:26,434 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:12:26,988 [ERROR]         olympe.pdraw.ANAFI-G134295 - destroy - Pdraw.destroy() timedout
2022-08-11 11:12:27,438 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:12:28,435 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:12:29,436 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:12:29,998 [ERROR]         olympe.pdraw.ANAFI-G134295 - _video_sink_flush_impl - video sink flush id 3 timeout
2022-08-11 11:12:30,022 [ERROR]         ulog - pomp - idle entry cb=0x7f99680fc0 userdata=0x7f5c002540 still in the loop
2022-08-11 11:12:30,022 [ERROR]         ulog - pomp - idle entry cb=0x7f9969c950 userdata=0x7f5c001160 still in the loop
2022-08-11 11:12:30,022 [ERROR]         ulog - pomp - idle entry cb=0x7f99680fc0 userdata=0x7f5c002540 still in the loop
2022-08-11 11:12:30,022 [ERROR]         ulog - pomp - idle entry cb=0x7f99680fc0 userdata=0x7f5c002540 still in the loop
2022-08-11 11:12:30,022 [ERROR]         ulog - pomp - fd=46, cb=0x7f99a66070 still in loop
2022-08-11 11:12:30,022 [ERROR]         ulog - pomp - fd=50, cb=0x7f9967bf90 still in loop
2022-08-11 11:12:30,022 [ERROR]         ulog - pomp - fd=48, cb=0x7f99a66070 still in loop
2022-08-11 11:12:30,022 [ERROR]         ulog - pomp - fd=54, cb=0x7f993067c0 still in loop
2022-08-11 11:12:30,023 [ERROR]         ulog - pomp - fd=56, cb=0x7f99a62660 still in loop
2022-08-11 11:12:30,023 [ERROR]         ulog - pomp - fd=45, cb=0x7f99a66070 still in loop
2022-08-11 11:12:30,023 [ERROR]         ulog - pomp - fd=51, cb=0x7f9967c190 still in loop
2022-08-11 11:12:30,023 [ERROR]         ulog - pomp - fd=49, cb=0x7f99a66070 still in loop
2022-08-11 11:12:30,023 [ERROR]         ulog - pomp - fd=53, cb=0x7f99a66070 still in loop
2022-08-11 11:12:30,023 [ERROR]         olympe.pdraw.ANAFI-G134295 - _destroy_pomp_loop - Error while destroying pomp loop: -16
2022-08-11 11:12:30,438 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:12:31,438 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??
2022-08-11 11:12:32,439 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??

Hi,

From your incomplete log, it seems reasonable to think that the Pi has completely lost connectivity to the drone. The ulog - pomp - read(fd=26) err=104(Connection reset by peer) error message is related to the streaming connection, but I suppose that you must have other log entries mentioning the loss of the SDK connection.

When Olympe gets disconnected from the drone, it resets its internal copy of the remote drone state. In this case, the drone.get_state() method raises an exception (which I suppose you’re catching in your code where you print “??” instead of the latitude/longitude of the drone).

Incomplete? Well, I have reduced the olympe log level to WARNING, is that what you mean by “incomplete”?

I will try to reproduce this without suppressing the exception in the method that gets the state, and report back.

EDIT: I found that I do not suppress any exception. I just “ask” before getting the state (I learned that your SDK loves to raise exceptions, and I wasn’t always well prepared for this :))

    def get_state_safe(self, state):
        ''' Survive possible exception (e.g. when value is not available yet). Ask before fetch '''
        if not self.drone.connection_state():
            return None
        if self.drone.check_state(state):
            return self.drone.get_state(state)
        return None

So I trace ?? if the result is None.

OK, here we have a complete log of this situation. I set the olympe log level to INFO.

I had the drone beside me on the desk, near the Pi. Then I took the drone and relocated it to the darkest corner of the basement. The death of the connection can be seen beautifully in the log. After the connection is lost, my get_state_safe returns None, which seems to be triggered by self.drone.connection_state() returning False.

All ok for me, except the various

2022-09-07 14:59:52,505 [ERROR] pltest.py: ignoring erroneous or incomplete frame

This is the visible symptom of an issue reported by Daniel. These frames, which carry no metadata, do not only occur once at the beginning. They seem to indicate a garbage frame.

I stupidly just catch the exception without tracing it, but whenever I did trace it, it was because the expected property wasn’t found on the empty vmeta object. But this is a different story. I only mention it so you can see that it also happens IN BETWEEN, most likely under bad reception conditions.

Even if the drone comes back “in sight”, the old abandoned drone object no longer gives any usable status or control. I suppose I will have to re-instantiate/re-connect it. That’s OK so far.
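To illustrate the re-instantiate/re-connect idea, here is a minimal sketch. It assumes a hypothetical zero-argument factory make_drone (for example lambda: olympe.Drone("192.168.42.1")) and only relies on the connect(), connection_state() and disconnect() calls already used in this thread. The retry count and delay are arbitrary illustration values, not recommendations from the SDK.

```python
import time


def reconnect(make_drone, attempts=5, delay_s=1.0, sleep=time.sleep):
    """Return a freshly connected drone object, or None if all attempts fail.

    make_drone is a hypothetical factory; a new object is built on every
    attempt because the old one did not recover once the link was lost.
    """
    for attempt in range(1, attempts + 1):
        drone = make_drone()
        try:
            drone.connect()
            if drone.connection_state():
                return drone
        except Exception:
            # Treat any failure during connect as "not connected yet".
            pass
        try:
            # Release whatever the failed attempt may have allocated.
            drone.disconnect()
        except Exception:
            pass
        if attempt < attempts:
            sleep(delay_s)
    return None
```

In an application this would be called from the routine that notices connection_state() has become False, replacing the stale drone object with the returned one.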

What is NOT ok is a cross-reference to this issue:

If I try to terminate my script with Ctrl+C, I run this sequence:

    def handle_sigint(self, signal, frame):
        ''' CTRL_C handler '''
        self.close()

    def close(self):
        try:
            if self.stat_timer:
                self.stat_timer.stop()
            self.stop_streaming()
            self.drone.disconnect()
        except:
            pass        
        sys.exit(0)        

The final “sys.exit(0)” is blocked by the SDK, which is somehow hanging while attempting to acquire an internal lock. An additional Ctrl+C is then required to kill this too.

2022-09-07 15:18:42,138 [INFO] pltest.py: state: ??, lat: ??, lng: ??, alt: ??, gd: ??
2022-09-07 15:18:43,044 [INFO] pltest.py: stop_streaming: stop streaming
2022-09-07 15:18:43,045 [INFO] 	olympe.drone.ANAFI-G134295 - disconnect - Disconnection with the device OK. IP: b'192.168.42.1'
2022-09-07 15:18:43,070 [INFO] 	olympe.drone.ANAFI-G134295 - _on_device_removed - <olympe.arsdkng.cmd_itf.DisconnectedEvent object at 0x7f6cda4b50>
2022-09-07 15:18:43,071 [INFO] 	olympe.media - _shutdown - olympe.media shutdown
2022-09-07 15:18:43,073 [INFO] 	olympe.scheduler - _destroy_pomp_loop - Pomp loop has been destroyed: subscribers_thread
2022-09-07 15:18:43,372 [INFO] 	olympe.media - _destroy_pomp_loop - Pomp loop has been destroyed: Thread-4
2022-09-07 15:18:43,374 [INFO] 	olympe.backend - _destroy_pomp_loop - Pomp loop has been destroyed: Thread-3
2022-09-07 15:19:06,286 [INFO] pltest.py: stop_streaming: stop streaming
Exception ignored in: <module 'threading' from '/home/pi/code/parrot-groundsdk/out/olympe-linux/pyenv_root/versions/3.9.5/lib/python3.9/threading.py'>
Traceback (most recent call last):
2022-09-07 15:19:06,287 [INFO] 	olympe.drone.ANAFI-G134295 - disconnect - Disconnection with the device OK. IP: b'192.168.42.1'
  File "/home/pi/code/parrot-groundsdk/out/olympe-linux/pyenv_root/versions/3.9.5/lib/python3.9/threading.py", line 1428, in _shutdown
    lock.acquire()
  File "/home/pi/anafi-pi-src/pltest.py", line 468, in handle_sigint
    self.close()
  File "/home/pi/anafi-pi-src/pltest.py", line 478, in close
    sys.exit(0)        
SystemExit: 0

I cannot post the full log of the entire case here, so I put it on Dropbox. I have obfuscated the geo-location. As you can see, there is no exception on connection loss.

https://www.dropbox.com/s/bifhm2vcccwlymi/untitled.txt?dl=1

Maybe I could check the connection state and NOT issue the stream stop and the drone disconnection, in order not to provoke this. This is only my test app. The real app would already be in a routine that tries to re-establish the connection, so this is probably a non-issue.

This time I didn’t see the connection reset by peer, which was the trigger for this question.

So what may remain are:

  • invalid/incomplete vmeta data in the frame callback
  • the blockade of the sys.exit(0) on a dead drone object

I reviewed my code in self.stop_streaming(). It looks as if I’m responsible for the thread hang myself, since I didn’t terminate the yuv thread: the entire procedure was wrapped in “if self.drone.connection_state()”. In case of a disconnected drone this evaluates to False and the thread wasn’t terminated. So we can bury this part of the issue.
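As a sketch of the fix described here, the processing thread can be stopped unconditionally, and only the drone-side calls are guarded by the connection check. FrameWorker and stop_streaming below are hypothetical helpers written for illustration, not part of Olympe; the only SDK calls assumed are drone.connection_state() and drone.streaming.stop() as used elsewhere in this thread. The check for a missing streaming attribute is an extra precaution.

```python
import queue
import threading


class FrameWorker:
    """Owns the frame queue and the processing thread (hypothetical helper)."""

    def __init__(self, handle_frame):
        self.frame_queue = queue.Queue()
        self._handle_frame = handle_frame
        self._running = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def start(self):
        self._running.set()
        self._thread.start()

    def _loop(self):
        while self._running.is_set():
            try:
                frame = self.frame_queue.get(timeout=0.1)
            except queue.Empty:
                continue
            self._handle_frame(frame)

    def stop(self, timeout=2.0):
        # No dependency on the drone: safe to call after a connection loss.
        self._running.clear()
        if self._thread.is_alive():
            self._thread.join(timeout)
        return not self._thread.is_alive()


def stop_streaming(drone, worker):
    """Stop the worker first, then touch the drone only if it is still usable."""
    worker_stopped = worker.stop()
    streaming = getattr(drone, "streaming", None)
    if streaming is not None and drone.connection_state():
        streaming.stop()
    return worker_stopped
```

The point of the ordering is that the thread is joined whether or not the drone is still connected, so the interpreter is not left waiting on it at exit.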

The incomplete vmeta issue remains.

This little test script boils the vmeta issue down to a simple thing:

import olympe
import sys
import signal
import time
import queue
import threading

class Test:
    def __init__(self):
      signal.signal(signal.SIGINT, self.handle_sigint)

      self.drone = olympe.Drone("192.168.42.1")
      self.drone.connect()

      if self.drone.connection_state():
        self.drone.streaming.set_callbacks(
                  raw_cb=self.yuv_frame_cb,
                  start_cb=self.yuv_start_cb,
                  end_cb=self.yuv_end_cb,
                  flush_raw_cb=self.yuv_flush_cb,
              )

        self.processing_thread = threading.Thread(target=self.yuv_frame_processing)
        self.drone.streaming.start()
        self.frame_queue = queue.Queue()
        self.running = True
        self.processing_thread.start()

    def yuv_frame_cb(self, yuv_frame):
      yuv_frame.ref()
      self.frame_queue.put_nowait(yuv_frame)

    def yuv_start_cb(self):
      pass
        
    def yuv_end_cb(self):
      pass
    
    def yuv_flush_cb(self, stream):
      if stream["vdef_format"] != olympe.VDEF_I420:
        return True
      while not self.frame_queue.empty():
        self.frame_queue.get_nowait().unref()
      return True


    def yuv_frame_processing(self):
      while self.running:
        try:
            yuv_frame = self.frame_queue.get(timeout=0.1)
            print(yuv_frame.vmeta()[1]["drone"])
        except queue.Empty:
            continue
        yuv_frame.unref()

    def handle_sigint(self, signal, frame):
      self.running = False
      self.processing_thread.join()
      assert(self.drone.streaming.stop())
      assert(self.drone.disconnect())
      sys.exit(0)

    def run(self):
      while True:
        time.sleep(0.1)
  


test = Test()
test.run()


If you run it and either disconnect the drone after being connected or move the drone out of reach, then you will see one or two issues:

  1. The final assert(self.drone.streaming.stop()) fails with

AttributeError: 'NoneType' object has no attribute 'stop'

This is most likely because the drone is already internally disconnected.

  2. If you slowly move the drone out of reception you will see traces like this:
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/home/pi/code/parrot-groundsdk/out/olympe-linux/pyenv_root/versions/3.9.5/lib/python3.9/threading.py", line 954, in _bootstrap_inner
    self.run()
  File "/home/pi/code/parrot-groundsdk/out/olympe-linux/pyenv_root/versions/3.9.5/lib/python3.9/threading.py", line 892, in run
    self._target(*self._args, **self._kwargs)
  File "/home/pi/anafi-pi-src/test.py", line 51, in yuv_frame_processing
    print(yuv_frame.vmeta()[1]["drone"])
KeyError: 'drone'

This is the empty vmeta object which needs to be explained.
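Until that is explained, the processing thread can at least survive such frames. The following is a minimal sketch, assuming that yuv_frame.vmeta() returns a pair whose second element is a dict, as the test script uses it; drone_meta is a hypothetical helper name.

```python
def drone_meta(vmeta):
    """Return the "drone" section of a frame's metadata, or None if absent.

    vmeta is assumed to be the value returned by yuv_frame.vmeta(), which
    the test script indexes as vmeta[1]["drone"].
    """
    try:
        data = vmeta[1]
    except (TypeError, IndexError):
        # Not a pair at all: treat as a frame without metadata.
        return None
    if not isinstance(data, dict):
        return None
    # dict.get avoids the KeyError seen in the traceback.
    return data.get("drone")
```

In yuv_frame_processing the print would then become something like meta = drone_meta(yuv_frame.vmeta()), followed by skipping the frame when meta is None, with yuv_frame.unref() still called in either case.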
