Can't get iwlist to work with Rt2x00 Git kernel.

Live forum: http://rt2x00.serialmonkey.com/viewtopic.php?t=4610

drag

09-02-2008 18:31:24

I recently got an older Dell Inspiron 4100. (Can you believe that the IT folks from work were going to throw it away?!)

It has a Mobile P3 processor running at 1133 MHz with SpeedStep that'll scale its CPU speed down to about 733 MHz or so.
512megs of ram.
20 gig drive.

It originally came with a broadcom wifi, but I removed it and replaced it with a rt2561 mini-pci since we have plenty of those badboys laying around.

Unfortunately, after using the laptop for a while I would experience regular kernel panics with it. At first I thought it had to do with the X 'nv' driver since it would usually happen while playing video, but I managed to get it to panic once while in the virtual console and it looked like rt2x00 stuff was the main culprit. Which makes sense since I use sshfs to mount my video share from my server. But I can't manage to make it panic reliably so that sucks.

So I decided to try the rt2x00 git repository from you guys and enable the debugging support and see if I get panics there.

Unfortunately with the git kernel I can't seem to get it to even detect my wireless access point. (a wrt54g Linksys sitting in my closet 10 feet away) There is no encryption, no hidden ssids or anything like that. It's wide open to the world.

This is what iwlist normally shows in my house (using the Debian Unstable 2.6.24-1-686 kernel)
wlan0     Scan completed :
          Cell 01 - Address: 00:14:BF:32:97:1F
                    ESSID:"goob"
                    Mode:Master
                    Channel:11
                    Frequency:2.462 GHz (Channel 11)
                    Quality=52/100  Signal level=-44 dBm
                    Encryption key:off
                    Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 11 Mb/s; 18 Mb/s
                              24 Mb/s; 36 Mb/s; 54 Mb/s; 6 Mb/s; 9 Mb/s
                              12 Mb/s; 48 Mb/s
                    Extra:tsf=0000023027509b21

I attached the dmesg output from an example attempt to get the wireless card to work. I disabled network-manager and rebooted.
Then I ran (more or less, don't remember everything)
ifconfig
iwconfig
iwconfig wlan0 essid "goob"
ifconfig up
dhclient wlan0
bg
iwlist scan

This git was pulled this afternoon.

IvD

09-02-2008 18:55:07

Known issue, I'm working on it.

You could try to use the 2.6.24 kernel as intermediate solution. That kernel also contains rt2x00 but a slightly older version.

drag

09-02-2008 18:59:30

Cool.
Let me know when you think you have it fixed. I will happily compile any sort of kernel or whatever you want to throw at me.

yug

11-02-2008 22:21:04

I hope this is caused by the same issue as above, but I'm in doubt
(today's git)
/lib/modules/2.6.24-666/kernel/drivers/net/wireless/rt2x00/rt2x00pci.ko
[code91398u0z]version: 2.1.1
srcversion: B109ACF6BC5B569FB15F89F[/code91398u0z]

That's a "classic" log when modprobing rt2500pci;
[code91398u0z]Feb 11 16:28:27 b1b1 kernel: ACPI: PCI Interrupt 0000:00:0b.0[A] -> GSI 19 (level, low) -> IRQ 18
Feb 11 16:28:27 b1b1 kernel: phy0 -> rt2500pci_validate_eeprom: EEPROM recovery - NIC: 0xfff0
Feb 11 16:28:27 b1b1 kernel: phy0 -> rt2x00_set_chip: Info - Chipset detected - rt: 0201, rf: 0003, rev: 00000004.
Feb 11 16:28:27 b1b1 kernel: phy0: Selected rate control algorithm 'pid'
Feb 11 16:28:28 b1b1 kernel: phy0: HW CONFIG: freq=2412
Feb 11 16:28:28 b1b1 kernel: phy0 -> rt2x00mac_conf_tx: Info - Configured TX queue 0 - CWmin: 5, CWmax: 10, Aifs: 2.
Feb 11 16:28:28 b1b1 kernel: phy0 -> rt2x00mac_conf_tx: Info - Configured TX queue 1 - CWmin: 5, CWmax: 10, Aifs: 2.
Feb 11 16:28:28 b1b1 kernel: phy0: HW CONFIG: freq=2412
Feb 11 16:28:28 b1b1 kernel: phy0 -> rt2x00mac_conf_tx: Info - Configured TX queue 0 - CWmin: 5, CWmax: 10, Aifs: 2.
Feb 11 16:28:28 b1b1 kernel: phy0 -> rt2x00mac_conf_tx: Info - Configured TX queue 1 - CWmin: 5, CWmax: 10, Aifs: 2.
Feb 11 16:28:28 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:28 b1b1 kernel: phy0: HW CONFIG: freq=2417
Feb 11 16:28:28 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:28 b1b1 kernel: phy0: HW CONFIG: freq=2422
Feb 11 16:28:28 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:28 b1b1 kernel: phy0: HW CONFIG: freq=2427
Feb 11 16:28:28 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:28 b1b1 kernel: phy0: HW CONFIG: freq=2432
Feb 11 16:28:29 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:29 b1b1 kernel: phy0: HW CONFIG: freq=2437
Feb 11 16:28:29 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:29 b1b1 kernel: phy0: HW CONFIG: freq=2442
Feb 11 16:28:29 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:29 b1b1 kernel: phy0: HW CONFIG: freq=2447
Feb 11 16:28:29 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:29 b1b1 kernel: phy0: HW CONFIG: freq=2452
Feb 11 16:28:29 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:29 b1b1 kernel: phy0: HW CONFIG: freq=2457
Feb 11 16:28:29 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:29 b1b1 kernel: phy0: HW CONFIG: freq=2462
Feb 11 16:28:29 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:29 b1b1 kernel: phy0: HW CONFIG: freq=2412
Feb 11 16:28:30 b1b1 kernel: phy0: HW CONFIG: freq=2412
Feb 11 16:28:30 b1b1 kernel: phy0: TX to low-level driver (len=42) FC=0x0040 DUR=0x0000 A1=ff:ff:ff:ff:ff:ff A2=00:08:a1:98:fa:0e A3=ff:ff:ff:ff:ff:ff
Feb 11 16:28:30 b1b1 kernel: phy0: HW CONFIG: freq=2417
Feb 11 16:28:30 b1b1 kernel: phy0: HW CONFIG: freq=2422
Feb 11 16:28:30 b1b1 kernel: phy0: HW CONFIG: freq=2427
Feb 11 16:28:30 b1b1 kernel: phy0: HW CONFIG: freq=2432
Feb 11 16:28:30 b1b1 kernel: phy0: HW CONFIG: freq=2437
Feb 11 16:28:30 b1b1 kernel: phy0: HW CONFIG: freq=2442
Feb 11 16:28:31 b1b1 kernel: phy0: HW CONFIG: freq=2447
Feb 11 16:28:31 b1b1 kernel: phy0: HW CONFIG: freq=2452
Feb 11 16:28:31 b1b1 kernel: phy0: HW CONFIG: freq=2457
Feb 11 16:28:31 b1b1 kernel: phy0: HW CONFIG: freq=2462
Feb 11 16:28:31 b1b1 kernel: phy0: HW CONFIG: freq=2412[/code91398u0z]

quickly followed by
[code91398u0z]
Feb 11 16:28:38 b1b1 kernel: phy0 -> rt2x00mac_conf_tx: Info - Configured TX queue 0 - CWmin: 5, CWmax: 10, Aifs: 2.
Feb 11 16:28:38 b1b1 kernel: phy0 -> rt2x00mac_conf_tx: Info - Configured TX queue 1 - CWmin: 5, CWmax: 10, Aifs: 2.
Feb 11 16:28:38 b1b1 kernel: phy0: HW CONFIG: freq=2462[/code91398u0z]

And of course, I get "No scan result" (my AP is on channel 3)
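To check which channels a scan actually visited, the "HW CONFIG: freq=" lines in the log above can be mapped back to channel numbers. The following is only a sketch: the function names `freq_to_channel` and `channels_scanned` are made up for illustration, and it covers the 2.4 GHz band only (channels 1 to 13 at 5 MHz spacing from 2412 MHz, plus channel 14 at 2484 MHz).

```shell
# Convert a 2.4 GHz centre frequency in MHz to its channel number.
# Hypothetical helper, not part of rt2x00 or wireless-tools.
freq_to_channel() {
    freq=$1
    if [ "$freq" -eq 2484 ]; then
        echo 14
    elif [ "$freq" -ge 2412 ] && [ "$freq" -le 2472 ] &&
         [ $(( (freq - 2412) % 5 )) -eq 0 ]; then
        echo $(( (freq - 2407) / 5 ))
    else
        echo "unknown"
        return 1
    fi
}

# Read kernel log lines on stdin and print each distinct channel
# that appears in a "HW CONFIG: freq=" message, in ascending order.
channels_scanned() {
    sed -n 's/.*HW CONFIG: freq=\([0-9][0-9]*\).*/\1/p' | sort -n | uniq |
    while read -r f; do
        freq_to_channel "$f"
    done
}
```

For example, `freq_to_channel 2422` prints 3, so the log above does show the driver tuning to the AP's channel during the scan, even though no result came back.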

Useless or not, cat /proc/interrupts gives
on 2.6.24 with 2.0.10 (working):
[code91398u0z] 18: 57882 IO-APIC-fasteoi 0000:00:0b.0[/code91398u0z]
on 2.6.24 wireless with rt2x00-git 2.1.1:
[code91398u0z]18: 0 IO-APIC-fasteoi 0000:00:0b.0[/code91398u0z]
(Maybe that is normal for the kernel, but at boot-up the BIOS reports IRQ 7.)
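A quick way to repeat this comparison is to pull the counter for the card's line out of /proc/interrupts before and after a scan. This is a minimal sketch; `irq_count` is a hypothetical helper name, and the device string is whatever appears in the last column on the machine in question.

```shell
# Print the IRQ label and the summed per-CPU count for every line
# whose text contains the string given as $1.
# $2 is the file to read and defaults to /proc/interrupts.
irq_count() {
    awk -v dev="$1" 'index($0, dev) {
        total = 0
        # Fields 2.. are per-CPU counters until the first non-numeric field
        for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
        print $1, total
    }' "${2:-/proc/interrupts}"
}
```

Running `irq_count 0000:00:0b.0` before and after `iwlist wlan0 scan` shows whether the card raised any interrupts at all; a count that stays at 0, as in the 2.1.1 case above, suggests it did not.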

IvD

11-02-2008 22:31:48

Could you enable debugfs and report what the contents of the "dev_flags" file are?

Also, the attached script might be useful for dumping the device registers.
Please dump them from the working and the non-working setup into separate files and attach them to this topic.

yug

12-02-2008 01:18:21

cat chipset
[quote1gj8n671]csr length: 93
eeprom length: 256
bbp length: 64
rf length: 5[/quote1gj8n671]

== NOT WORKING dev_flags value ==
[code1gj8n671]At startup (init 1) :
0x00001003
After a net.wlan0 start (failed) :
0x00001007
After an ifconfig wlan0 up (ok) :
0x0000102f
After an iwlist wlan0 scan (no scan result) :
0x0000100f[/code1gj8n671]

When the card is not working, the LED is off, and the register values are not available in debugfs/ieee80211/phy0/rt2500pci (nothing new there).

== WORKING dev_flags value ==
[code1gj8n671]At startup (init 1) :
0x00001003
After a net.wlan0 start (ok) and after (eg: iwlist) :
0x0001012f[/code1gj8n671]
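To see exactly which flag bits differ between two dev_flags readings, the two words can be XORed. The sketch below only reports bit positions; what each bit means depends on the flag definitions in the rt2x00 version in use, which are not given in this thread, so no names are attached. `flag_diff` is a hypothetical helper name.

```shell
# Print every bit position that differs between two flag words,
# together with its value in the first and in the second word.
flag_diff() {
    diff=$(( $1 ^ $2 ))
    bit=0
    while [ "$diff" -ne 0 ]; do
        if [ $(( diff & 1 )) -eq 1 ]; then
            printf 'bit %d: %d -> %d\n' "$bit" \
                $(( ($1 >> bit) & 1 )) $(( ($2 >> bit) & 1 ))
        fi
        diff=$(( diff >> 1 ))
        bit=$(( bit + 1 ))
    done
}
```

Comparing the last non-working value with the working one, `flag_diff 0x0000100f 0x0001012f` reports bits 5, 8 and 16 as set only in the working case and bit 12 as set only in the non-working case.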

Attached 4 dumps when the card is working.

Can I do anything else ?

IvD

12-02-2008 17:27:53

Hmm, those 4 dumps, which dump is which status?
Which ones are taken with the working setup and which ones with the broken setup?

yug

13-02-2008 07:11:26

So there are 3 successive dumps with non-working module 2.1.1.

(I have to say that setting the rate parameter in the iface config file makes the 2.0.10 version work far better; I didn't imagine it was so important.)

IvD

13-02-2008 16:55:02

Ah, but those logs look really interesting, I think I have a clue about what is going wrong....
Hopefully I can have a fix soon.

IvD

14-02-2008 16:44:44

You did make those dumps while the interface was up, right?

yug

15-02-2008 10:42:27

Hum, I don't remember. One of them must be.
So I am sending another set of more "precise" dumps.

---- 2.1.1 LEDS always off
//#boot in INIT 2, at startup, essid=""; iface down
ralink-NOTwork-2.6.24-666-1203069309
//ifconfig wlan0 up
ralink-NOTwork-2.6.24-666-1203069357
//iwlist wlan0 scan #(no scan result)
ralink-NOTwork-2.6.24-666-1203069408
ralink-NOTwork-2.6.24-666-1203069416
//#again
ralink-NOTwork-2.6.24-666-1203069712
//modprobe -r rt2500pci && modprobe rt2500pci && sleep 2
ralink-NOTwork-2.6.24-666-1203069757
//ifconfig wlan0 up
ralink-NOTwork-2.6.24-666-1203069784
ralink-NOTwork-2.6.24-666-1203069842
//iwlist wlan0 scan (no scan result)
ralink-NOTwork-2.6.24-666-1203069858

---- 2.1.1 again
//#boot in INIT 3 (gentoo can't bring net.wlan0 up)
ralink-NOTwork-2.6.24-666-1203070035
//ifconfig
ralink-NOTwork-2.6.24-666-1203070098
//ifconfig wlan0 up
ralink-NOTwork-2.6.24-666-1203070132


---- 2.0.10 LEDS on since INIT 3
//#dumps when card is well working since I put the rate parameter
ralink-work-2.6.24-1203067144
ralink-work-2.6.24-1203067267
ralink-work-2.6.24-1203067508
ralink-work-2.6.24-1203068123
//#another one after a reboot
ralink-work-2.6.24-1203070743
//iwconfig wlan0 rate 54M #because it was strangely set to 1M
ralink-work-2.6.24-1203070830

A modified version of the registerDumper (in attachment too)

[code3a7d03na]#!/bin/bash
# (bash rather than sh: the script uses arrays and C-style for loops)
# With two arguments: list the registers that differ between 2 dumps
if [ $# -eq 2 ]; then
    diff "$1" "$2" |
        sed 's/^.* \([0-9]\+\) .*$/@@\1/Ig' | grep '@@' | sort -n | uniq | cut -d'@' -f3
    exit 0
fi

# Otherwise: dump the registers
reg=(bbp csr eeprom)
# <debugfs mount point>/ieee80211/<phyN>/<driver module>
path="`mount|grep debugfs|sed 's@^[^/]*/\([^ ]*\) .*$@/\1@Ig'`/ieee80211/`dmesg |tail -150|grep phy|tail -1|sed 's/^\(phy[0-9]\).*$/\1/'`/`lsmod|cut -d' ' -f1|grep 'rt[0-9][^x]'`"
file="."
stat="work"
# If the register files live in a "register" subdirectory,
# dev_flags and chipset are one level up
if [ -d "$path/register" ]; then
    path="$path/register"
    file=".."
    stat="NOTwork"
fi
logname="/tmp/ralink-$stat-`uname -r`-`date +%s`"
echo "$logname"
cd "$path" || exit 1
(
    echo "dev_flags:"
    cat "$file/dev_flags"
    for j in "${reg[@]}"; do
        echo "$j:" | tr "[:lower:]" "[:upper:]"
        # The chipset file gives the number of words in each register set
        for ((i = 0; i < `grep "$j" "$file/chipset" | cut -d : -f 2`; i++)); do
            echo -n "$i "
            echo "$i" > "${j}_offset"   # select the word index
            cat "${j}_value"            # read the value at that index
        done
    done
) > "$logname"
exit [/code3a7d03na]

IvD

15-02-2008 14:09:41

Could you retest with latest rt2x00.git,
it should be fixed now.

yug

15-02-2008 19:55:09

I just gave it a try;
the results are in the attachment
[code1likq7xw]init 3 # net.wlan0 failed, wlan0 down
[dump]
[dump]
ifconfig wlan0 up
[dump]
[dump]
ifconfig
[dump]
iwlist wlan0 scan
iwlist wlan0 scan
[dump]
modprobe -r rt2500pci
# let's have a new try
modprobe rt2500pci
[dump]
ifconfig
[dump]
ifconfig wlan0 up
[dump]
[dump]
iwlist wlan0 scan
[dump][/code1likq7xw]

(To replace each [dump] with its file name, in order; the ### still needs to be replaced by sed's "quit" command.)
[code1likq7xw] for i in `ls ralink-*`; do sed -n "s/^\[dump\]$/$i/###" dbug; done[/code1likq7xw]
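One possible way to do this substitution without the missing sed command is to let awk walk the log and hand out the file names one by one. This is a sketch, not the poster's method; `fill_dumps` is a hypothetical name, and it assumes the log contains lines consisting of exactly `[dump]`.

```shell
# Replace the Nth "[dump]" line of a log with the Nth name given.
# $1: log file; remaining arguments: dump file names, in order.
fill_dumps() {
    log=$1
    shift
    printf '%s\n' "$@" |
    awk 'NR == FNR { name[++n] = $0; next }            # first input: the names
         $0 == "[dump]" && k < n { print name[++k]; next }
         { print }' - "$log"
}
```

Usage would be something like `fill_dumps dbug ralink-* > dbug.named`. The shell expands `ralink-*` in lexical order, which matches the chronological order only while the names share one prefix and the timestamps have the same number of digits.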

Is there any better way to help you?
Please excuse my noobness in giving so much information and not the few interesting bits.

AdamBaker

16-02-2008 15:25:25

You needed to be a bit careful what version you tested with yesterday, the patch

rt2x00 Fix Descriptor DMA initialization patch

was the one that made rt61 start working and the patch

rt2x00 Cleanup mode registration

broke it until the patch

Fix hw mode registration with mac80211.

was applied.

If the tree you tested with was either missing the 1st patch or had the 1st 2 but not the third then it is worth trying again.

drag

16-02-2008 19:59:34

I am able to get it to work, sorta. It's very unreliable. No oopsies or anything like that, but the signal level seems low, the rate seems stuck to 1M only, but I got it to run 2M for a bit.

I was able to get network-manager to connect to a neighbor's AP quite easily, but when I tried to connect to my own it was unable to. Trying iwlist and such, it would not find anything.

So that is how it's working now.. I am able to connect to an AP once, and scans right after a reboot can find the APs in my area, but after a little while iwlist won't return anything but the AP I am connected to, and a little bit later it won't return anything at all. After a few minutes the connection dropped out and I was not able to connect to anything.

A reboot would make it work for a little while again. Not sure what is going on. Right now I am recompiling a kernel with all the debugfs stuff on, which I forgot for this run..

IvD

16-02-2008 21:34:08

Could you try increasing the rate using 'iwconfig wlan0 rate 54m'?
Also, when debugfs is enabled, could you check what the contents are of the file <debugfsroot>/<mac80211 root>/rt2500pci/queue/queue

check the contents after the TX/RX are failing.

AdamBaker

16-02-2008 23:21:54

I'm seeing similar poor data rates with b43 using the latest rt2x00 kernel so I'd be tempted for now to lay the blame for that on mac80211 changes. I have also seen b43 drop the link more often than it used to but unlike rt61 it recovers with an ifdown / ifup sequence so for now I'm assuming mac80211 isn't to blame for that.

(For test purposes I'm running a pcmcia rt61 in a machine that has onboard bcm4318)

drag

17-02-2008 01:41:43

Sorry this took so long. Life got in the way. It took waayyy.. too long. It takes about an hour to compile a kernel, and I tried to slim the config down and missed a bunch of modules. It's been a long time since I tried to customize a kernel.. used to do it every upgrade. Oh well.

If there is anything else I can do please ask.. it won't take nearly as long this time.


I just copied a bunch of those queue files and the commands I ran right before I copied them. You'll see what I mean..
457 cp /debug/ieee80211/phy0/rt61pci/queue/queue ./queue.before
458 cat queue.brought.up.first
459 sudo iwconfig
460 sudo ifconfig wlan0 up
461 cp /debug/ieee80211/phy0/rt61pci/queue/queue ./queue.iwconfig.wlan0.up

etc etc.

I went
ifconfig wlan0 up
iwlist scan
iwconfig wlan0 essid "goob"
dhclient wlan0
ping -f -c 1000 192.168.0.254
iwlist scan
ifconfig wlan0 down

All that worked just fine. Once the device was brought back up it stopped working.

ifconfig wlan0 up
iwlist scan
iwconfig wlan0 rate 54M
(ya I know it's pointless after it's failed, but whatever.)
iwlist scan
This was the last queue copy I made... queue.iwlist.scan.fourth.time

IvD

19-02-2008 11:47:07

Please test attached patch to see if that helps.

IvD

19-02-2008 18:11:30

Instead of the patch, please update to latest rt2x00.git version. That one contains above patch + 2nd fix.

drag

24-02-2008 10:11:23

This post took me way too long again. Sorry, it took me a long time to get a working kernel out of the latest switch to 2.6.25-rc2. Absolutely not your fault, just me being stupid.

So this is the best I can do, even though I'd like to do more. I wrote a script that makes a copy of the entire debugfs mount. I mounted it at /debug and tried to get all the files I could, but lots of them came up with 'bad address' errors so I could not copy them or cat them.

so the script ran a command, made a copy of the /debug, then ran the next command, etc etc.

A copy of the script is in the tarball. Hopefully there is something useful in there for you.

drag

25-02-2008 03:08:06

Oh ya.

That was using the latest git 2.6.25-rc2 version; I downloaded it and let it compile most of Saturday. Same issue as before, and the details on the commands, debugfs, and failures/successes are all documented in those two tarballs.

Would it be helpful if I sent you some hardware? I tried recording as much information as possible, but if you don't have a mini-pci rt2561 card I have a spare one laying around that I can send.

I am also compiling a preemptive kernel, if that is a problem.

IvD

25-02-2008 14:59:19

I have a rt61 device for testing purposes, the problem is more in the available time for me to test. ;)

I have committed a fix to git that might solve the problem.

drag

25-02-2008 18:50:43

Cool. I'll try out the new git patch as soon as possible.

Please check out the tarball, though, if you can. My very simple script in there will create a copy of the debugfs..
at least as well as I can figure out how to do it.. I use tar to create the directories and use cp to copy the individual text files. I have debugfs mounted at /debug.

It makes such a copy after each command I went through to reproduce the problem. Right now it reproduces the problem 100% of the time on my laptop with a completely open access point, and it has a log showing the output from the commands that I used.

It brings the device up, runs an iwlist scan, connects to the AP, grabs a DHCP lease, ping floods it with a thousand pings, brings the device down, then brings it back up, and does two iwlist scans. Each command is spaced 2 seconds apart except for the last iwlist scan, which is done at 20 seconds. After that, in order to connect to the AP again I have to reboot. I believe I have an iwconfig set rate command in there, but it doesn't really make a difference to the iwlist scan stuff.
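The procedure described here (run a command, log its output, copy the debugfs tree, wait, repeat) could be sketched roughly as follows. This is not the script from the tarball, which is not shown in the thread; `run_and_snapshot` and `STEP_DELAY` are hypothetical names, and a plain `cp -r` stands in for the tar and cp combination mentioned above.

```shell
# Run each command in turn, log its output, then copy a directory tree.
# $1: directory to snapshot (e.g. the debugfs mount point)
# $2: output directory; remaining arguments: one command string each.
run_and_snapshot() {
    src=$1
    out=$2
    shift 2
    step=0
    mkdir -p "$out"
    for cmd in "$@"; do
        step=$(( step + 1 ))
        printf '### step %d: %s\n' "$step" "$cmd" >> "$out/commands.log"
        sh -c "$cmd" >> "$out/commands.log" 2>&1
        mkdir -p "$out/step$step"
        # Some debugfs files cannot be read; log the errors and carry on
        cp -r "$src/." "$out/step$step/" 2>> "$out/copy-errors.log" || true
        sleep "${STEP_DELAY:-2}"    # 2 seconds between steps by default
    done
}
```

A call might look like `run_and_snapshot /debug /tmp/run1 "ifconfig wlan0 up" "iwlist wlan0 scan" "ifconfig wlan0 down"`. Files that need more than a plain read to be useful will not be captured meaningfully this way.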

Also if I am doing something very stupid then you can yell at me. Thanks for your time, btw. I appreciate it.

IvD

25-02-2008 19:05:20

[quoteef8k04de]Cool. I'll try out the new git patch as soon as possible.
[/quoteef8k04de]

Thanks.


[quoteef8k04de]Please check out the tarball, though, if you can. My very simple script in there will create a copy of the debugfs..
at least as well as I can figure out how to do it.. I use tar to create the directories and use cp to copy the individual text files. I have debugfs mounted at /debug.
[/quoteef8k04de]

The problem with that script is that it doesn't handle the rt2x00 debugfs entries well. The rt2x00 debugfs entries are a bit more magical than plain text like the mac80211 entries.

For example, the frame/dump entry should be read continuously to get all incoming and outgoing frames inside the device. (Thus obtained at a much lower level than wireshark getting its information from wlan0.) To use the incoming data you need some scripts from Mattias Nissler to convert the data into something wireshark can handle.

The register/* entries are mappings to the device registers: you write a word index to, for example, csr_offset, and csr_value will then contain the register value for that offset. (The same goes for the BBP, EEPROM and RF registers, which each have their own *_offset and *_value entries.)
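The offset/value mechanism can be sketched as a pair of small shell functions. The helper names `read_reg` and `dump_regs` are made up for illustration; the only things taken from this thread are the `<name>_offset` and `<name>_value` file names and the idea of writing a word index before reading the value.

```shell
# Read one register word.
# $1: register directory, $2: register set (csr, bbp, eeprom or rf),
# $3: word index to select.
read_reg() {
    printf '%s\n' "$3" > "$1/${2}_offset" || return 1
    cat "$1/${2}_value"
}

# Dump words 0 .. count-1 of one register set as "index value" lines.
# $1: register directory, $2: register set, $3: number of words.
dump_regs() {
    i=0
    while [ "$i" -lt "$3" ]; do
        printf '%s ' "$i"
        read_reg "$1" "$2" "$i" || return 1
        i=$(( i + 1 ))
    done
}
```

The word count for each set would come from the matching "length" line of the chipset file, as the register dumper script earlier in this thread does.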

drag

26-02-2008 01:00:53

Cool.

That'll give me something to chew on for a while. Thanks for the information.

IvD

26-02-2008 10:45:46

In a few threads I have attached a script that can dump all registers for a device into a single list.
At the moment the computer I have stored that script on has died (well, at least the power supply did), so I can't attach it right now. But if you search for those other threads you're very likely to find it.

Hazzl

04-03-2008 21:20:17

Wouldn't it be possible to store this script somewhere on this website, so that we don't need to attach it to multiple threads?