CoT extraction from RTSP stream

Hi,

I am trying to extract real-time drone position (CoT/telemetry) from a Parrot Anafi (SkyController 3) using the RTSP stream (rtsp://192.168.53.1/live).

Our setup:

  • Source: RTSP stream from SkyController 3.
  • Goal: Extract KLV metadata (MISB 0601) to generate CoT (Cursor-on-Target) for ATAK.
  • Processing: Using FFmpeg to demux the stream and klvdata (Python) for parsing.

The Problem: We have identified the metadata stream (usually labeled by FFmpeg as Video: h264, none), but we are facing several issues:

  1. Dynamic Stream Mapping: The metadata stream ID changes (e.g., from 0:2 to 0:4) between sessions, making it hard to target consistently.
  2. Parser Errors: Even when the stream is correctly mapped, standard KLV parsers fail with “index-sized integer” errors or “UnknownElement” attributes. It seems the KLV data is either wrapped in a non-standard way or contains proprietary tags that break MISB 0601 compliance.
  3. Data Consistency: FFmpeg often reports “dimensions not set” when trying to pipe the metadata stream, even when using -c copy -f data.
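For issue 1, we can partly work around the dynamic mapping by resolving the stream index with ffprobe at session start (e.g. `ffprobe -v quiet -print_format json -show_streams rtsp://192.168.53.1/live`) instead of hard-coding 0:2. A minimal sketch of the selection logic; the stream layout in the sample JSON below is illustrative, not a capture from a real session:

```python
import json

# Hypothetical ffprobe JSON output; the stream layout is illustrative only.
FFPROBE_JSON = """
{
  "streams": [
    {"index": 0, "codec_type": "video", "codec_name": "h264"},
    {"index": 4, "codec_type": "data", "codec_name": "none"}
  ]
}
"""

def find_data_stream_index(ffprobe_json):
    """Return the index of the first data/metadata stream, or None."""
    info = json.loads(ffprobe_json)
    for stream in info.get("streams", []):
        if stream.get("codec_type") == "data":
            return stream["index"]
    return None

idx = find_data_stream_index(FFPROBE_JSON)
print(idx)  # -> 4, so the ffmpeg mapping for this session would be "-map 0:4"
```

The resolved index can then be substituted into the ffmpeg command line before each session.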

Question: Is there a specific bitstream filter or a recommended way to extract the raw KLV packets from the RTSP stream so they are readable by standard MISB parsers? Are there any proprietary headers we need to skip before the Universal Label (06 0e 2b 34...)?
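To look for such headers ourselves, we currently scan the raw packet bytes for the UL prefix and note its offset, along these lines (the sample packet is synthetic, for illustration only):

```python
# Leading bytes of the SMPTE/MISB KLV Universal Label mentioned above.
UL_PREFIX = bytes.fromhex("060e2b34")

def find_ul_offsets(buf: bytes) -> list:
    """Return every offset in buf where the UL prefix starts."""
    offsets = []
    pos = buf.find(UL_PREFIX)
    while pos != -1:
        offsets.append(pos)
        pos = buf.find(UL_PREFIX, pos + 1)
    return offsets

# Synthetic packet: 3 unknown header bytes, then the UL prefix, then payload.
sample = b"\x00\x01\x02" + UL_PREFIX + b"\xaa\xbb"
print(find_ul_offsets(sample))  # -> [3]
```

So far this scan finds no UL at all in the Anafi metadata packets, which is part of what prompted the question.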

Any guidance on the exact structure of the vmeta packets within the RTSP container would be greatly appreciated.

Best regards,

Jindrich

Hello @Jindrich,

First of all, can you confirm which drone you are talking about? You posted this message in the Anafi UKR section, but you mention an Anafi with SkyController 3.

Regarding your parsing errors, the answer is simple: we do not use standard MISB 0601 KLV. The metadata in the stream is encoded as Protobuf messages, which is why standard KLV parsers fail.

Here is the recommended approach:

  1. Format: The metadata must be parsed with a Protobuf parser, using the definitions in vmeta.proto on GitHub.
  2. Extraction: FFmpeg is unreliable for this specific dynamic stream. We recommend using the PDrAW library to properly extract the H.264 frames and metadata together.
  3. GroundSDK (Android): If you are building an app directly on the controller, GroundSDK does not support extracting encoded frames natively. You will have to use RawVideoSink (which decodes the frames), add your CoT/KLV metadata, and re-encode the video.

Best regards,
Hugo

Dear Hugo,

I hope you are doing well. I am writing to follow up on our previous conversation regarding metadata extraction from the Parrot Anafi stream.

I would like to clarify that we are currently working with the Parrot Anafi UKR / XLR running FreeFlight 8 software.

We have made some progress: using the Olympe library, we are successfully able to extract the metadata and the video stream. However, we have hit a significant roadblock. Our target platform is a microcomputer (NanoPi) which is not supported by Olympe (due to its specific architecture and dependencies).

Because of this, we are looking for a way to bypass Olympe and extract the metadata directly from the RTSP stream or via other protocols. Here is what we have tried so far:

  • We integrated the vmeta.proto definitions from GitHub into our project.
  • We are attempting to capture the H.264 stream and look for SEI (Supplemental Enhancement Information) NAL units or linked data streams.

Despite our efforts, we are unable to parse the raw data into a readable format using the generated Protobuf classes. Either the stream is missing the vmeta tags where we expect them, or we are misidentifying the data blocks.
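For reference, the kind of SEI scan we are attempting looks roughly like this: walk the Annex-B byte stream for start codes and flag NAL units of type 6 (SEI). This is a simplified sketch on a synthetic buffer (4-byte start codes only, no emulation-prevention handling), not a full SEI payload parser:

```python
START_CODE = b"\x00\x00\x00\x01"

def find_nal_units(stream: bytes) -> list:
    """Return (offset, nal_type) for each Annex-B NAL unit found."""
    units = []
    pos = stream.find(START_CODE)
    while pos != -1:
        header = stream[pos + 4]
        units.append((pos, header & 0x1F))  # low 5 bits = H.264 nal_unit_type
        pos = stream.find(START_CODE, pos + 4)
    return units

# Synthetic stream: an SPS (type 7) followed by an SEI (type 6).
stream = START_CODE + b"\x67\x00" + START_CODE + b"\x06\x05\x00"
sei_offsets = [off for off, t in find_nal_units(stream) if t == 6]
print(sei_offsets)  # -> [6]
```

On the real stream this scan finds NAL units, but we have not yet located a payload the generated Protobuf classes will accept.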

Could you please assist us with the following?

  1. Is it possible to extract telemetry/metadata from the Anafi UKR/XLR without using Olympe?
  2. Is the metadata embedded directly in the H.264 SEI NAL units in FreeFlight 8, or is there a separate UDP port (e.g., for MAVLink or a raw stream) we should listen to?
  3. Could you provide any guidance or documentation on how to correctly parse the raw metadata stream for this specific drone model?

Our goal is to create a lightweight bridge that can run on ARM-based microcomputers where the full Olympe stack cannot be installed.

Thank you very much for your time and help.

Best regards,

Jindrich

Hi,

If you need to bypass Olympe and extract metadata directly from the RTSP stream, you can use the libpdraw-vsink library available in the PDrAW sources (Parrot-Developers/pdraw on GitHub).

This API is easy to use and allows you to retrieve video frames along with their associated metadata. You can refer to this example to understand how to implement it: pdraw_vsink_test.c.

Since your target platform is a microcomputer (NanoPi), you might want to bypass the decoding step (which is CPU-intensive) and request the coded H.264 frames instead. To do this, replace all occurrences of pdraw_raw_video_sink with pdraw_coded_video_sink and mbuf_raw_video_frame with mbuf_coded_video_frame in pdraw_vsink.c. This will allow you to access the coded frames and their metadata without the overhead of decoding them.

Regards,
Mathieu