[rt2x00-users] rt2x00: status update regarding txdone handling
Helmut Schaa
helmut.schaa at googlemail.com
Wed Aug 4 08:40:59 UTC 2010
On Wednesday 04 August 2010, Ivo Van Doorn wrote:
> On Wed, Aug 4, 2010 at 10:25 AM, Helmut Schaa
> <helmut.schaa at googlemail.com> wrote:
> > On Monday 02 August 2010, Ivo Van Doorn wrote:
> >> On Mon, Aug 2, 2010 at 10:38 AM, Helmut Schaa
> >> <helmut.schaa at googlemail.com> wrote:
> >> > On Thursday 29 July 2010, Ivo Van Doorn wrote:
> >> >> On Thu, Jul 29, 2010 at 6:30 PM, Helmut Schaa
> >> >> <helmut.schaa at googlemail.com> wrote:
> >> >> > On Thursday 29 July 2010, Ivo Van Doorn wrote:
> >> >> >> Hi,
> >> >> >>
> >> >> >> > I'm still fighting with the txdone handling in rt2800pci. Sometimes the tx
> >> >> >> > queues get stuck. It took me a few days, but it seems as if the TX_STA_FIFO
> >> >> >> > register's limit of 16 entries is causing this problem.
> >> >> >>
> >> >> >> Weird, for rt2800usb this doesn't seem to be the problem. The queue locks
> >> >> >> up even when I continuously read the TX_STA_FIFO register.
> >> >> >
> >> >> > Maybe I should add that this behaviour only happens when running iperf
> >> >> > from the SoC board to an associated client, with the host CPU utilization
> >> >> > close to 100%.
> >> >>
> >> >> Ah ok, well I haven't managed to get iperf working on my system yet,
> >> >> so I can't test that. But since the queue locks up regardless of CPU and
> >> >> TX_STA_FIFO status, it might be a different problem (perhaps even for SoC).
> >> >
> >> > Just for the record: since I use OpenWrt for development I cannot simply run
> >> > rt2x00 git on the board, so I normally use the current compat-wireless plus
> >> > some rt2x00 patches (if any).
> >>
> >> Well I use the compat-wireless package as well, I really don't want to reinstall
> >> the kernel for each update. :)
> >>
> >> > The funny thing is: I can easily trigger the stuck tx queue problem with
> >> > 2.6.34 + compat-wireless whereas it seems to not happen with 2.6.32 +
> >> > compat-wireless.
> >> >
> >> > I've just double-checked, and on 2.6.32 the TX_STA_FIFO also sometimes contains
> >> > 16 entries. So maybe my conclusion regarding the TX_STA_FIFO overflow is
> >> > incorrect ... Not sure though. I can also see some non-freed entries in the
> >> > tx queue when using 2.6.32, but I cannot make it get stuck completely.
> >> >
> >> > /me is confused now.
> >>
> >> Well, not me. :P I doubted the TX_STA_FIFO explanation anyway, since it
> >> would mean that rt2800pci and rt2800usb suffer from the same queue lockup
> >> but with apparently different causes. Now that it seems the rt2800pci
> >> queue lockup isn't caused by a TX_STA_FIFO overflow, it might be the same
> >> bug as in rt2800usb. Which is actually nice, since if this issue is fixed,
> >> it is fixed for all hardware. :)
> >>
> >> Perhaps we should check the TX(WI) descriptor. I understood from Ralink that
> >> the HW queue handler might lock up when it finds an unexpected value. At least in
> >> rt2800usb the values for TXINFO_W0_USB_DMA_NEXT_VALID and
> >> TXINFO_W0_SW_USE_LAST_ROUND could cause problems, although rt2800usb
> >> does send the correct values (always 0). But maybe there is a wrong
> >> TXWI field....
> >
> > Ok, I found some time to investigate a little further. First, I used my patches
> > to read the TX_STA_FIFO from hard irq context and process it in the
> > interrupt thread; this should reduce the average number of tx statuses read
> > from the register. Second, I now used all DMA_DONE and the TX_DONE interrupts
> > for reading the TX_STA_FIFO.
> >
> > The most notable difference is that with these changes a read of TX_STA_FIFO
> > never comes close to 16 status entries, which means an overflow shouldn't
> > happen at all. However, even in that case it seems as if the tx status of
> > some frames gets lost :(. In short, the number of tx statuses read from
> > TX_STA_FIFO is smaller than the number of tx'ed frames, which again leaves some
> > frames remaining in the tx queue.
> >
> > So, I'm trying to find the answers to these questions now:
> >
> > - Can a TX_STA_FIFO overflow happen at all?
>
> I think not.
I know ;) but I'm not 100% convinced. The question is: what is the hw going to
do when you send more than 16 frames before reading the TX_STA_FIFO? Will it
stop sending out frames once it reaches 16? Hmm, that could be found out with
some driver hacking I guess ...
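
A rough idea of the bookkeeping such a hack could use, as a standalone sketch
rather than actual rt2x00 code. All names below (struct txdone_stats and the
stats_* helpers) are made up for illustration. The gap between queued frames
and status reads is only an upper bound on what sits in the TX_STA_FIFO, since
it also counts frames the hw has not sent yet, so a gap above 16 means an
overflow is possible, not that one happened:

```c
#include <assert.h>

/* Per this thread, the TX_STA_FIFO register holds at most 16 entries. */
#define TX_STA_FIFO_DEPTH 16

/* Hypothetical debug counters, not part of rt2x00. */
struct txdone_stats {
	unsigned long queued;      /* frames handed to the hardware */
	unsigned long status_read; /* valid entries read from TX_STA_FIFO */
	unsigned long max_gap;     /* largest (queued - status_read) seen */
};

/* Call whenever a frame is handed to the hardware. */
void stats_frame_queued(struct txdone_stats *s)
{
	unsigned long gap;

	s->queued++;
	gap = s->queued - s->status_read;
	if (gap > s->max_gap)
		s->max_gap = gap;
}

/* Call for every valid entry read from TX_STA_FIFO. */
void stats_status_read(struct txdone_stats *s)
{
	/* A status without a matching queued frame would be a bug. */
	assert(s->status_read < s->queued);
	s->status_read++;
}

/* Nonzero if more frames were outstanding than the FIFO can hold. */
int stats_fifo_could_overflow(const struct txdone_stats *s)
{
	return s->max_gap > TX_STA_FIFO_DEPTH;
}
```

If max_gap goes above 16 and the hw still keeps sending, it does not simply
stop at 16 pending statuses; if the gap never grows past 16 under load, that
would hint at the hw throttling itself.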
> > - Does every frame get a tx status? Or in which cases might a tx status get
> > lost and why? Is that easily detectable?
>
> Well, I think I have mentioned this a couple of times before, but this issue is
> the same as in rt61pci; frames are being lost there as well.
Ok, thanks for that info, I wasn't sure whether rt61pci suffers from that issue
as well. Do you know any details on how to trigger the issue on rt61pci? At least
on rt2800pci on a SoC it is easily triggered by saturating the connection
and/or the CPU (I'm not sure which of the two is more important).
> If we can map a TX_STA_FIFO entry correctly to a particular entry in the
> queue, then we can report all missed entries with the UNKNOWN state. There
> is hardly any alternative to this approach...
Agreed. I already have a vague idea on how we could achieve something like
that. Let me play around a bit first. As soon as I've got something suitable
(or a failure) I'll post another update.
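
To make the idea a bit more concrete, here is a minimal userspace sketch, not
driver code. It assumes that every frame can be stamped with an id that comes
back in its TX_STA_FIFO entry and that statuses arrive in submission order;
struct tx_ring, ring_submit() and ring_status() are hypothetical names and not
rt2x00 API. When a status arrives, every older pending entry is assumed to
have lost its status and is flagged UNKNOWN:

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8 /* hypothetical queue depth */

enum tx_result { TX_PENDING, TX_SUCCESS, TX_FAILURE, TX_UNKNOWN };

/* Hypothetical model of one tx queue. */
struct tx_ring {
	unsigned int id[RING_SIZE];       /* id stamped on each frame */
	enum tx_result result[RING_SIZE]; /* last known result per slot */
	size_t done;    /* oldest slot still waiting for a status */
	size_t pending; /* number of slots waiting for a status */
};

/* Queue a frame; returns 0 on success, -1 if the ring is full. */
int ring_submit(struct tx_ring *r, unsigned int id)
{
	size_t slot;

	if (r->pending == RING_SIZE)
		return -1;
	slot = (r->done + r->pending) % RING_SIZE;
	r->id[slot] = id;
	r->result[slot] = TX_PENDING;
	r->pending++;
	return 0;
}

/*
 * Handle one status entry. Returns how many older entries were flagged
 * TX_UNKNOWN, or -1 if no pending entry carries this id.
 */
int ring_status(struct tx_ring *r, unsigned int id, int success)
{
	size_t i, offset = r->pending;

	/* Find the pending entry this status belongs to. */
	for (i = 0; i < r->pending; i++) {
		if (r->id[(r->done + i) % RING_SIZE] == id) {
			offset = i;
			break;
		}
	}
	if (offset == r->pending)
		return -1; /* stale or bogus status, ignore it */

	/* Everything queued before the match never got a status. */
	for (i = 0; i < offset; i++)
		r->result[(r->done + i) % RING_SIZE] = TX_UNKNOWN;

	r->result[(r->done + offset) % RING_SIZE] =
		success ? TX_SUCCESS : TX_FAILURE;
	r->done = (r->done + offset + 1) % RING_SIZE;
	r->pending -= offset + 1;
	return (int)offset;
}
```

With frames 1, 2 and 3 queued and statuses arriving only for 1 and 3, frame 2
ends up as TX_UNKNOWN and the queue is empty again instead of stuck. Whether
the hw gives us an id that is unique enough for this is exactly what I have to
find out.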
> The reason for the loss of TX status reports is unknown, and Ralink couldn't
> give any answers on this issue either.
Too bad ;(
Thanks,
Helmut