CoT extraction from RTSP stream

Hi,

I am trying to extract real-time drone position (CoT/telemetry) from a Parrot Anafi (SkyController 3) using the RTSP stream (rtsp://192.168.53.1/live).

Our setup:

  • Source: RTSP stream from SkyController 3.
  • Goal: Extract KLV metadata (MISB 0601) to generate CoT (Cursor-on-Target) for ATAK.
  • Processing: Using FFmpeg to demux the stream and klvdata (Python) for parsing.

The Problem: We have identified the metadata stream (usually labeled by FFmpeg as Video: h264, none), but we are facing several issues:

  1. Dynamic Stream Mapping: The metadata stream ID changes (e.g., from 0:2 to 0:4) between sessions, making it hard to target consistently.
  2. Parser Errors: Even when the stream is correctly mapped, standard KLV parsers fail with “index-sized integer” errors or “UnknownElement” attributes. It seems the KLV data is either wrapped in a non-standard way or contains proprietary tags that break MISB 0601 compliance.
  3. Data Consistency: FFmpeg often reports “dimensions not set” when trying to pipe the metadata stream, even when using -c copy -f data.
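For issue 1, we can partly work around the dynamic mapping by resolving the stream index with ffprobe at session start (e.g. `ffprobe -v quiet -print_format json -show_streams rtsp://192.168.53.1/live`) instead of hard-coding 0:2. A minimal sketch of the selection logic; the stream layout in the sample JSON below is illustrative, not a capture from a real session:

```python
import json

# Hypothetical ffprobe JSON output; the stream layout is illustrative only.
FFPROBE_JSON = """
{
  "streams": [
    {"index": 0, "codec_type": "video", "codec_name": "h264"},
    {"index": 4, "codec_type": "data", "codec_name": "none"}
  ]
}
"""

def find_data_stream_index(ffprobe_json):
    """Return the index of the first data/metadata stream, or None."""
    info = json.loads(ffprobe_json)
    for stream in info.get("streams", []):
        if stream.get("codec_type") == "data":
            return stream["index"]
    return None

idx = find_data_stream_index(FFPROBE_JSON)
print(idx)  # -> 4, so the ffmpeg mapping for this session would be "-map 0:4"
```

The resolved index can then be substituted into the ffmpeg command line before each session.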

Question: Is there a specific bitstream filter or a recommended way to extract the raw KLV packets from the RTSP stream so they are readable by standard MISB parsers? Are there any proprietary headers we need to skip before the Universal Label (06 0e 2b 34...)?
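To look for such headers ourselves, we currently scan the raw packet bytes for the UL prefix and note its offset, along these lines (the sample packet is synthetic, for illustration only):

```python
# Leading bytes of the SMPTE/MISB KLV Universal Label mentioned above.
UL_PREFIX = bytes.fromhex("060e2b34")

def find_ul_offsets(buf: bytes) -> list:
    """Return every offset in buf where the UL prefix starts."""
    offsets = []
    pos = buf.find(UL_PREFIX)
    while pos != -1:
        offsets.append(pos)
        pos = buf.find(UL_PREFIX, pos + 1)
    return offsets

# Synthetic packet: 3 unknown header bytes, then the UL prefix, then payload.
sample = b"\x00\x01\x02" + UL_PREFIX + b"\xaa\xbb"
print(find_ul_offsets(sample))  # -> [3]
```

So far this scan finds no UL at all in the Anafi metadata packets, which is part of what prompted the question.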

Any guidance on the exact structure of the vmeta packets within the RTSP container would be greatly appreciated.

Best regards,

Jindrich

Hello @Jindrich,

First of all, can you confirm which drone you are talking about? You posted this message in the Anafi UKR section, but you mention an Anafi with SkyController 3.

Regarding your parsing errors, the answer is simple: we do not use standard MISB 0601 KLV. The metadata in the stream is encoded as Protobuf messages, which is why standard KLV parsers fail.

Here is the recommended approach:

  1. Format: The metadata must be parsed with a Protobuf parser, using the definitions in vmeta.proto on GitHub.
  2. Extraction: FFmpeg is unreliable for this specific dynamic stream. We recommend using the PDrAW library to properly extract the H.264 frames and metadata together.
  3. GroundSDK (Android): If you are building an app directly on the controller, GroundSDK does not support extracting encoded frames natively. You will have to use RawVideoSink (which decodes the frames), add your CoT/KLV metadata, and re-encode the video.

Best regards,
Hugo

Dear Hugo,

I hope you are doing well. I am writing to follow up on our previous conversation regarding metadata extraction from the Parrot Anafi stream.

I would like to clarify that we are currently working with the Parrot Anafi UKR / XLR running FreeFlight 8 software.

We have made some progress: using the Olympe library, we are successfully able to extract the metadata and the video stream. However, we have hit a significant roadblock. Our target platform is a microcomputer (NanoPi) which is not supported by Olympe (due to its specific architecture and dependencies).

Because of this, we are looking for a way to bypass Olympe and extract the metadata directly from the RTSP stream or via other protocols. Here is what we have tried so far:

  • We integrated the vmeta.proto definitions from GitHub into our project.
  • We are attempting to capture the H.264 stream and look for SEI (Supplemental Enhancement Information) NAL units or linked data streams.

Despite our efforts, we are unable to parse the raw data into a readable format using the generated Protobuf classes. Either the stream is missing the vmeta tags where we expect them, or we are misidentifying the data blocks.
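For reference, the kind of SEI scan we are attempting looks roughly like this: walk the Annex-B byte stream for start codes and flag NAL units of type 6 (SEI). This is a simplified sketch on a synthetic buffer (4-byte start codes only, no emulation-prevention handling), not a full SEI payload parser:

```python
START_CODE = b"\x00\x00\x00\x01"

def find_nal_units(stream: bytes) -> list:
    """Return (offset, nal_type) for each Annex-B NAL unit found."""
    units = []
    pos = stream.find(START_CODE)
    while pos != -1:
        header = stream[pos + 4]
        units.append((pos, header & 0x1F))  # low 5 bits = H.264 nal_unit_type
        pos = stream.find(START_CODE, pos + 4)
    return units

# Synthetic stream: an SPS (type 7) followed by an SEI (type 6).
stream = START_CODE + b"\x67\x00" + START_CODE + b"\x06\x05\x00"
sei_offsets = [off for off, t in find_nal_units(stream) if t == 6]
print(sei_offsets)  # -> [6]
```

On the real stream this scan finds NAL units, but we have not yet located a payload the generated Protobuf classes will accept.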

Could you please assist us with the following?

  1. Is it possible to extract telemetry/metadata from the Anafi UKR/XLR without using Olympe?
  2. Is the metadata embedded directly in the H.264 SEI NAL units in FreeFlight 8, or is there a separate UDP port (e.g., for MAVLink or a raw stream) we should listen to?
  3. Could you provide any guidance or documentation on how to correctly parse the raw metadata stream for this specific drone model?

Our goal is to create a lightweight bridge that can run on ARM-based microcomputers where the full Olympe stack cannot be installed.

Thank you very much for your time and help.

Best regards,

Jindrich

Hi,

If you need to bypass Olympe and extract metadata directly from the RTSP stream, you can use the libpdraw-vsink library available in the PDrAW sources (Parrot-Developers/pdraw on GitHub).

This API is easy to use and allows you to retrieve video frames along with their associated metadata. You can refer to this example to understand how to implement it: pdraw_vsink_test.c.

Since your target platform is a microcomputer (NanoPi), you might want to bypass the decoding step (which is CPU-intensive) and request the coded H.264 frames instead. To do this, replace all occurrences of pdraw_raw_video_sink with pdraw_coded_video_sink and mbuf_raw_video_frame with mbuf_coded_video_frame in pdraw_vsink.c. This will allow you to access the coded frames and their metadata without the overhead of decoding them.

Regards,
Mathieu