rt2800usb stops at random

Live forum: http://rt2x00.serialmonkey.com/viewtopic.php?t=6210

noranuko

24-05-2012 00:14:37

Hi,

I have a WG-G300U (Ralink RT3070) USB dongle.
I use it with hostapd on Ubuntu 11.04 on a BeagleBoard-xM.

However, clients often cannot connect to the hostapd AP even though its SSID is visible.

I tried a different machine, an x86 PC running Ubuntu 12.04, but the same thing happens.
I checked the driver, and it seems that the RX/TX handler is not running.

When this happens, the following procedure sometimes restores the connection:

/etc/init.d/hostapd stop
ifup wlan0
/etc/init.d/hostapd start

I suspect a timing or locking problem, but I cannot pinpoint where.

# lsmod
Module Size Used by
arc4 1217 2
rt2800usb 10405 0
rt2800lib 41459 1 rt2800usb
crc_ccitt 1487 1 rt2800lib
rt2x00usb 9793 1 rt2800usb
rt2x00lib 45690 3 rt2800usb,rt2800lib,rt2x00usb
mac80211 250219 3 rt2800lib,rt2x00usb,rt2x00lib
cfg80211 151013 2 rt2x00lib,mac80211
smsc95xx 12479 0
rtc_twl 4500 0
rtc_ds1307 6630 0
rtc_core 19784 2 rtc_twl,rtc_ds1307
twl4030_madc_hwmon 2620 0
gpio_keys 6160 0

# hostapd.conf
interface=wlan0
driver=nl80211
logger_syslog=-1
logger_syslog_level=1
logger_stdout=-1
logger_stdout_level=1
dump_file=/tmp/hostapd.dump
ctrl_interface=/var/run/hostapd
ctrl_interface_group=0
ssid=000test
country_code=JP
ieee80211d=1
hw_mode=g
channel=11
beacon_int=110
dtim_period=1
max_num_sta=256
rts_threshold=2347
fragm_threshold=2346
preamble=1
macaddr_acl=0
auth_algs=3
ignore_broadcast_ssid=0
wmm_enabled=1
wmm_ac_bk_cwmin=4
wmm_ac_bk_cwmax=10
wmm_ac_bk_aifs=7
wmm_ac_bk_txop_limit=0
wmm_ac_bk_acm=0
wmm_ac_be_aifs=3
wmm_ac_be_cwmin=4
wmm_ac_be_cwmax=10
wmm_ac_be_txop_limit=0
wmm_ac_be_acm=0
wmm_ac_vi_aifs=2
wmm_ac_vi_cwmin=3
wmm_ac_vi_cwmax=4
wmm_ac_vi_txop_limit=94
wmm_ac_vi_acm=0
wmm_ac_vo_aifs=2
wmm_ac_vo_cwmin=2
wmm_ac_vo_cwmax=3
wmm_ac_vo_txop_limit=47
wmm_ac_vo_acm=0
ieee80211n=1
eapol_key_index_workaround=0
own_ip_addr=172.16.0.1

Regards,

noranuko

25-05-2012 02:54:39

I am debugging the source code.

While debugging, it looks like the ENTRY_DATA_STATUS_PENDING flag ends up in an inconsistent state.
The ENTRY_DATA_STATUS_PENDING bit is sometimes not set when rt2x00usb_work_rxdone runs.

noranuko

25-05-2012 10:03:20

I changed rt2x00usb.c, and it seems to work.
The stall appears to occur when both ENTRY_OWNER_DEVICE_DATA and ENTRY_DATA_STATUS_PENDING are 0,
so the change clears any queue_entry that ends up in that state.

This state seems to arise sometimes during initialization,
but I could not work out where.

static void rt2x00usb_work_rxdone(struct work_struct *work)
{
	struct rt2x00_dev *rt2x00dev =
	    container_of(work, struct rt2x00_dev, rxdone_work);
	struct queue_entry *entry;
	struct skb_frame_desc *skbdesc;
	u8 rxd[32];

	while (!rt2x00queue_empty(rt2x00dev->rx)) {
		entry = rt2x00queue_get_entry(rt2x00dev->rx, Q_INDEX_DONE);
+		if (!test_bit(ENTRY_OWNER_DEVICE_DATA, &entry->flags) &&
+		    !test_bit(ENTRY_DATA_STATUS_PENDING, &entry->flags))
+		{
+			rt2x00usb_clear_entry(entry);
+			queue_work(rt2x00dev->workqueue, &rt2x00dev->rxdone_work);
+			break;
+		}

		if (test_bit(ENTRY_OWNER_DEVICE_DATA, &entry->flags) ||
		    !test_bit(ENTRY_DATA_STATUS_PENDING, &entry->flags))
			break;

		/*
		 * Fill in desc fields of the skb descriptor
		 */
		skbdesc = get_skb_frame_desc(entry->skb);
		skbdesc->desc = rxd;
		skbdesc->desc_len = entry->queue->desc_size;

		/*
		 * Send the frame to rt2x00lib for further processing.
		 */
		rt2x00lib_rxdone(entry, GFP_KERNEL);
	}
}

noranuko

25-05-2012 11:42:37

The problem seems to happen through the following sequence.

1. During initialization, rt2x00usb_kick_queue registers a large number of rxdone_work items on the workqueue.
2. Before all of that rxdone_work has completed, new rxdone_work starts.
3. The two race over queue_entry, and an entry that was never completed is left in the queue.
4. Because the entry at the head of the queue is never removed, the RX path stalls.

My change only deletes the entry that is in the error state; it is not a fundamental fix.
Could the developers please provide a proper correction?
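
The stall in the steps above can be reproduced in a small user-space sketch. This is not driver code: `sim_entry`, `sim_queue`, `sim_clear_entry`, `sim_device_complete` and `sim_rxdone` are hypothetical names, the ring is reduced to two flag bits per entry, and it assumes that clearing an RX entry hands it back to the device. Only the two flag checks mirror the loop in rt2x00usb_work_rxdone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4

/* Hypothetical stand-in for struct queue_entry: only the two flag bits. */
struct sim_entry {
	bool owner_device;   /* models ENTRY_OWNER_DEVICE_DATA */
	bool status_pending; /* models ENTRY_DATA_STATUS_PENDING */
};

/* Hypothetical stand-in for the RX data_queue. */
struct sim_queue {
	struct sim_entry entries[RING_SIZE];
	size_t done;   /* models the Q_INDEX_DONE position */
	size_t length; /* entries still waiting to be processed */
};

/* Assumed effect of clearing an RX entry: it is handed back to the device. */
static void sim_clear_entry(struct sim_entry *e)
{
	e->owner_device = true;
	e->status_pending = false;
}

/* Models the device finishing a transfer into the entry. */
static void sim_device_complete(struct sim_entry *e)
{
	e->owner_device = false;
	e->status_pending = true;
}

/*
 * Models the rxdone loop and returns how many entries it processed.
 * With workaround == true, a head entry that has neither flag set is
 * cleared instead of blocking the queue, as in the change posted above.
 */
static size_t sim_rxdone(struct sim_queue *q, bool workaround)
{
	size_t processed = 0;

	assert(q->done < RING_SIZE && q->length <= RING_SIZE);

	while (q->length > 0) {
		struct sim_entry *e = &q->entries[q->done];

		if (workaround && !e->owner_device && !e->status_pending) {
			sim_clear_entry(e);
			break; /* the real change re-queues rxdone_work here */
		}

		if (e->owner_device || !e->status_pending)
			break; /* head not ready: nothing behind it is processed */

		e->status_pending = false; /* simplified: frame consumed */
		q->done = (q->done + 1) % RING_SIZE;
		q->length--;
		processed++;
	}
	return processed;
}
```

With a head entry that has both flags at 0 and ready entries behind it, `sim_rxdone(&q, false)` returns 0 on every call, which is the stall. With `workaround` set, the first call hands the head entry back to the device, and once `sim_device_complete` has run on it the next call drains the queue.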

dmonner

29-05-2012 15:01:50

I would like to add that I'm experiencing what I believe to be the same problem. I'm using a Linksys WMP600N, which uses the following modules: rt2800pci, rt2800lib, rt2x00pci, and rt2x00lib. I have only noticed this problem since upgrading to Ubuntu 12.04, which uses a 3.2.0-24 kernel.

The problem is that the wireless connection will often simply stop transmitting all data. This happens WITHOUT the connection dropping. Indeed, disconnecting and reconnecting to the AP is the only way to fix the problem, but it needs to be done manually. These interruptions in the active connection seem to happen most often when I'm doing something "bursty" with the connection, like starting to download a large file from a fast server, or testing my connection on SpeedTest.net (in fact, this last one triggers it almost every time).

I'm replying to this thread because it sounds like the issue that noranuko identified in this thread could be the cause. The problems with the queue that he describes sound like they could cause this issue, and the conditions that lead to this failure (more work entering the queue before the last batch finishes) seem to correspond to my experience of the problem being frequent in "bursty" connection conditions.

fmiceli24

30-05-2012 12:17:17

I am also experiencing the same problem.

I have a Debian 6 box with a
Bus 001 Device 003: ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter
and compat-wireless (rt2x00usb) version 3.2-rc1-1.

I have set up hostapd v1.0 with the following config:
[code]bridge=br0
interface=wlan0
driver=nl80211
ssid=pruebajuan
auth_algs=1
channel=1
hw_mode=g
wpa=0
ignore_broadcast_ssid=0
ctrl_interface=/var/run/hostapd
logger_syslog=-1
logger_syslog_level=0
logger_stdout=-1
logger_stdout_level=0
[/code]

The AP sometimes works like a charm, but at other times it does not even log association requests. The state seems to change when I restart the hostapd service, but I have to restart it many times to get it working again.

The problem seems to be random, but I made some wireless captures with AirPcap, and it appears that when the AP does not allow associations, it sends no probe responses to the client's probe requests.

I am attaching the captures, which can be read with Wireshark, along with the hostapd log.

The problem also occurred with hostapd v0.6 (the one in the Debian 6 repositories), so I am inclined to think the problem is in the driver.

I am willing to help in any way I can, by testing patches or, with some guidance, by looking at the code.

Thanks

fmiceli24

30-05-2012 12:45:52

One question

Has this change solved the problem?

[quote][code]	while (!rt2x00queue_empty(rt2x00dev->rx)) {
		entry = rt2x00queue_get_entry(rt2x00dev->rx, Q_INDEX_DONE);
+		if (!test_bit(ENTRY_OWNER_DEVICE_DATA, &entry->flags) &&
+		    !test_bit(ENTRY_DATA_STATUS_PENDING, &entry->flags))
+		{
+			rt2x00usb_clear_entry(entry);
+			queue_work(rt2x00dev->workqueue, &rt2x00dev->rxdone_work);
+			break;
+		}[/code][/quote]

fmiceli24

30-05-2012 16:54:23

I can confirm this also happens on the RT3090 (which is using rt2x00pci).

noranuko

04-06-2012 00:14:50

Hi, fmiceli24.

>One question
>
>Has this change solved the problem?

Yes.
In my environment, it seems to have solved the problem.
However, I think this change only affects USB Wi-Fi dongles.
It has no effect on PCI devices.

fmiceli24

04-06-2012 11:33:47

I'll try it on my usb rt2800 and report back.
Thanks.

fmiceli24

12-06-2012 13:58:08

I can confirm the patch works on rt2x00usb.

I have found that the problem still exists, but it is now much less frequent. I tested the stability of the connection by systematically downloading the same file over and over. After six hours or so I get disassociated and cannot associate again until I restart hostapd.

It also seems that by that time mon.wlan0 stops receiving anything.

Is there anywhere else these queues should be flushed?

fmiceli24

29-06-2012 11:58:54

The problem was solved by a patch developed by Stanislaw Gruszka, which I am attaching to this comment.

The developer says that it could introduce other kinds of problems. I haven't found any yet.

From f6663740c0186b7007a052535c16155ace7a742b Mon Sep 17 00:00:00 2001
From: Stanislaw Gruszka <sgruszka@redhat.com>
Date: Mon, 25 Jun 2012 12:45:03 +0200
Subject: [PATCH] rt2x00usb: do not check STATUS_PENDING bit on rx

ENTRY_DATA_STATUS_PENDING indicate state of TX entry, when we wait for
status for just sent frame. It has no sense to set/keep this bit for
RX entries, and seems doing that introduce race condition that make
possible RX path to stall.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
---
 drivers/net/wireless/rt2x00/rt2x00usb.c | 6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/rt2x00/rt2x00usb.c b/drivers/net/wireless/rt2x00/rt2x00usb.c
index d357d1e..26f5a27 100644
--- a/drivers/net/wireless/rt2x00/rt2x00usb.c
+++ b/drivers/net/wireless/rt2x00/rt2x00usb.c
@@ -344,8 +344,7 @@ static void rt2x00usb_work_rxdone(struct work_struct *work)
 	while (!rt2x00queue_empty(rt2x00dev->rx)) {
 		entry = rt2x00queue_get_entry(rt2x00dev->rx, Q_INDEX_DONE);

-		if (test_bit(ENTRY_OWNER_DEVICE_DATA, &entry->flags) ||
-		    !test_bit(ENTRY_DATA_STATUS_PENDING, &entry->flags))
+		if (test_bit(ENTRY_OWNER_DEVICE_DATA, &entry->flags))
 			break;

 		/*
@@ -397,8 +396,7 @@ static bool rt2x00usb_kick_rx_entry(struct queue_entry *entry, void* data)
 	struct queue_entry_priv_usb *entry_priv = entry->priv_data;
 	int status;

-	if (test_and_set_bit(ENTRY_OWNER_DEVICE_DATA, &entry->flags) ||
-	    test_bit(ENTRY_DATA_STATUS_PENDING, &entry->flags))
+	if (test_and_set_bit(ENTRY_OWNER_DEVICE_DATA, &entry->flags))
 		return false;

 	rt2x00lib_dmastart(entry);
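
To see what the first hunk of this patch changes, the two stop conditions can be written as plain predicates over the two flag bits. This is a minimal sketch with hypothetical function names (`rx_stops_old`, `rx_stops_new`, `count_disagreements`), not code from the driver; the boolean parameters stand in for `test_bit()` on ENTRY_OWNER_DEVICE_DATA and ENTRY_DATA_STATUS_PENDING.

```c
#include <assert.h>  /* for callers that want to assert on the results */
#include <stdbool.h>

/* Check before the patch: stop unless the host owns the entry AND
 * STATUS_PENDING is set. */
static bool rx_stops_old(bool owner_device, bool status_pending)
{
	return owner_device || !status_pending;
}

/* Check after the patch: stop only while the device still owns the entry. */
static bool rx_stops_new(bool owner_device, bool status_pending)
{
	(void)status_pending; /* no longer consulted on the RX path */
	return owner_device;
}

/* Counts the flag combinations on which the two checks disagree. */
static int count_disagreements(void)
{
	int n = 0;

	for (int owner = 0; owner <= 1; owner++)
		for (int pending = 0; pending <= 1; pending++)
			if (rx_stops_old(owner, pending) !=
			    rx_stops_new(owner, pending))
				n++;
	return n;
}
```

The two checks differ in exactly one case: an entry that the device does not own and that has no STATUS_PENDING bit. The old check stops the loop on such an entry, which matches the state noranuko reported at the head of the stalled queue; the new check lets the loop continue past it.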

fmiceli24

03-07-2012 13:32:06

Stanislaw has provided a different patch for this issue, one that does not cause a kernel panic (which could happen with the previous patch).

I have tested it for more than 18 hours straight of continuous activity. No problems so far in AP mode.

The patch is the one that follows.

[quote]Subject: [PATCH] rt2x00usb: fix indexes ordering on RX queue kick

On rt2x00_dmastart() we increase index specified by Q_INDEX and on
rt2x00_dmadone() we increase index specified by Q_INDEX_DONE. So entries
between Q_INDEX_DONE and Q_INDEX are those we currently process in the
hardware. Entries between Q_INDEX and Q_INDEX_DONE are those we can
submit to the hardware.

According to that fix rt2x00usb_kick_queue(), as we need to submit rx
entries that are not processed by the hardware. It worked before only
for empty queue, otherwise was broken.

Note that for TX queues indexes ordering are ok. We need to kick entries
that have filled skb, but was not submitted to the hardware, i.e.
started from Q_INDEX_DONE and have ENTRY_DATA_PENDING bit set.

From practical standpoint this patch fixes AP mode connection hangs.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
---
 drivers/net/wireless/rt2x00/rt2x00usb.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/rt2x00/rt2x00usb.c b/drivers/net/wireless/rt2x00/rt2x00usb.c
index d357d1e..74ecc33 100644
--- a/drivers/net/wireless/rt2x00/rt2x00usb.c
+++ b/drivers/net/wireless/rt2x00/rt2x00usb.c
@@ -436,8 +436,8 @@ void rt2x00usb_kick_queue(struct data_queue *queue)
 	case QID_RX:
 		if (!rt2x00queue_full(queue))
 			rt2x00queue_for_each_entry(queue,
-						   Q_INDEX_DONE,
 						   Q_INDEX,
+						   Q_INDEX_DONE,
 						   NULL,
 						   rt2x00usb_kick_rx_entry);
 		break;[/quote]
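
The index ordering described in the patch can be illustrated with a small ring walk in user space. This is a sketch under stated assumptions, not the driver's code: `ring_walk` and `RING_LIMIT` are hypothetical, and the sketch assumes the iteration helper visits entries from the start index up to, but not including, the end index, wrapping at the ring limit and covering the whole ring when both indexes are equal.

```c
#include <assert.h>
#include <stddef.h>

#define RING_LIMIT 8

/*
 * Records the ring positions visited when walking from `start` up to
 * (not including) `end`, wrapping at RING_LIMIT. Equal indexes walk the
 * whole ring. Returns the number of positions written to `out`.
 */
static size_t ring_walk(size_t start, size_t end, size_t out[RING_LIMIT])
{
	size_t n = 0;
	size_t i;

	assert(start < RING_LIMIT && end < RING_LIMIT);

	if (start < end) {
		for (i = start; i < end; i++)
			out[n++] = i;
	} else {
		for (i = start; i < RING_LIMIT; i++)
			out[n++] = i;
		for (i = 0; i < end; i++)
			out[n++] = i;
	}
	return n;
}
```

Take Q_INDEX_DONE at 2 and Q_INDEX at 5. Walking from Q_INDEX_DONE to Q_INDEX visits entries 2, 3, 4, which the hardware is already processing, so a kick in that order submits nothing new. Walking from Q_INDEX to Q_INDEX_DONE visits 5, 6, 7, 0, 1, the entries that can be submitted. When both indexes are equal, either order covers the whole ring, which is consistent with the patch's remark that the old order only worked for an empty queue.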

Kamaran

17-09-2014 06:36:18

Hi there!
This phenomenon has several causes, as noranuko mentioned earlier. I support noranuko's comments and ideas, and I agree with his reasoning. Thanks for the post, and thanks to noranuko for such a good answer.