-
Notifications
You must be signed in to change notification settings - Fork 14.8k
drivers/gps: prioritize non-blocking reads over injection #25535
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dakejahl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely an improvmement. The handleInjectDataTopic function can still block for a while though. If it loops the max (8) times and injects 300 bytes each time at 115200 baud that results in ~200ms blocking. At 921600 it's only 26ms and is probably goes unnoticed.
|
Not sure this fixes the situation as much as necessary- As jake pointed out, it would definitely make sure that on entry, IF GPS bytes are available it will service them first, but I think the parsing happens after this call, so we'd still get blocked if the inject takes a long time. I think the fundamental problem is that in the duration of time it takes to read in 8 rtcm messages, the 600 byte hardware buffer (v5x) could have overflowed several times, so next loop even with data available in the buffer, its not the continuation of the last time we read it because the receive buffer overflowed. |
|
I think the correct fix to this problem is to fully separate the read and write threads so that the read task will NEVER block, and the write task will write as much data as possible. |
|
Maybe we can break off the RTCM injection into a SubscriptionCallback work item. Is the driver doing any other writes aside from RTCM injection during normal run time? |
|
@dakejahl, If we really do not want to be blocked by the |
|
Another thread is probably the easiest way to accomplish the job but not strictly necessary. The other option is to interleave reads/writes and only write small chunks at a time. We also would need to buffer the RTCM uORB data so that we can chunk it out at our leisure. |
|
I'm not 100% sure how the EKF handles the GPS timing, but I believe there is a parameter that adjusts the expected delay of the GPS data vs the EKF time horizon. If we are varying the timing baesd on injection and blocking reads while we do, the GPS data may come in at variable timing which is probably subpar for the EKF. |
|
It seems silly that we have a full duplex serial port but can't actually use it that way, So I'd support something that properly parallelizes the process. Perhaps the GPS reading thread can operate as normal, and the writing can happen on a work queue (so it isn't a full extra thread?) |
|
The EKF time horizon is ~100ms so GPS read jitter shouldn't have an impact as long as we stay well below that. A work queue is still a full thread but we might be able to get away with putting it on the hp_default, but I would probably be safe and just give it it's own thread. @dagar what do you think? Also, who wants to fix this? I'm pretty busy at the moment but am interested in seeing this solved quickly. @jeremyzff want to take a stab at it? |
|
the work queue is a full thread that is more or less already "paid for" right? so adding it to a work queue adds minimal overhead was my thought. But a separate thread might be ok too. I'm sensitive to the RAM usage because our drones are starting to get pretty full... |
|
regarding read jitter- a burst of 8 messages of full 180 bytes at 115200 baud GPS rate is 125ms, which means that is how much variation I might have in when I read the GPS data between when there is no injection or max injection... Am I thinking about this right? so if it already takes 100ms, it could sometimes take 100ms, and other times take 225ms? |
|
A work queue is a thread that can be shared by multiple work items. But these work items run sequentially in the thread context. So all items in the queue are blocked by the slowest running item they share it with. If we don't use another thread and instead interleave non-blocking read/writes there won't really be any jitter since the reads/write are non-blocking. We just need to make sure we're checking the return value of read/write to ensure we don't quietly drop data. The output data (RTCM) needs to be buffered so that if we try to write 200 bytes but instead only write 50, we still have those 150 waiting to be written. |
|
we've tested this, and its part of the overall solution for us. By itself, this didn't fix our problem because the injection still took same amount of time and the RX buffer would overflow. But with the reversion of the fsync->tcdrain patch also which made writes nonblocking again, this still makes sure that we process a lot of the GPS buffer if its got backed up while we were away. |
No description provided.