[rt2x00-users] [PATCH] rt2x00: Don't queue ieee80211 work after USB removal

Ivo van Doorn ivdoorn at gmail.com
Sat Oct 31 08:56:40 UTC 2009


On Saturday 31 October 2009, Sean Cross wrote:
> 
> On Oct 30, 2009, at Fri 30, 5:51 PM, Ivo van Doorn wrote:
> >> #include <linux/kernel.h>
> >> #include <linux/module.h>
> >>
> >> @@ -362,8 +363,9 @@
> >>
> >>    rt2x00link_reset_tuner(rt2x00dev, false);
> >>
> >> -   ieee80211_queue_delayed_work(rt2x00dev->hw,
> >> -                 &link->work, LINK_TUNE_INTERVAL);
> >> +   if(test_bit(DEVICE_STATE_PRESENT, &rt2x00dev->flags))
> >> +      ieee80211_queue_delayed_work(rt2x00dev->hw,
> >> +                  &link->work, LINK_TUNE_INTERVAL);
> >> }
> >>
> >> void rt2x00link_stop_tuner(struct rt2x00_dev *rt2x00dev)
> >> @@ -469,8 +471,10 @@
> >>     * Increase tuner counter, and reschedule the next link tuner run.
> >>     */
> >>    link->count++;
> >> -   ieee80211_queue_delayed_work(rt2x00dev->hw,
> >> -                 &link->work, LINK_TUNE_INTERVAL);
> >> +
> >> +   if(test_bit(DEVICE_STATE_PRESENT, &rt2x00dev->flags))
> >> +      ieee80211_queue_delayed_work(rt2x00dev->hw,
> >> +                  &link->work, LINK_TUNE_INTERVAL);
> >> }
> >
> > Actually these 2 changes are incorrec, what happens when the device  
> > disapears between scheduling
> > and starting the thread? Wouldn't that cause the same error you are  
> > trying to fix? Wouldn't checking
> > the flag at the start of the thread be more meaningful?
> 
> 
> Perhaps it does make more sense to test for DEVICE_STATE_PRESENT in  
> rt2x00link_start_tuner, just so rt2x00link_reset_tuner() isn't called  
> on a card that is no longer present.
> 
> But the timing issue I'm seeing is something like this:
> 
> 1) Card is removed
> 2) ENODEV messages start getting received.
> 3) Something decides to start the tuner.
> 4) link->work gets queued to happen sometime in the future.
> 5) The USB system finally decides to unload the driver.  I think  
> rt2x00usb_disconnect() gets called here.
> 6) link->work gets run, but as the device has been unloaded, the  
> delayed_work is no longer valid, so a KP occurs.
> 
> The root cause is that link->work is suddenly no longer valid, so the  
> flag can't get checked at the start of the thread.  Since I think  
> ENODEV will get returned before rt2x00link_start_tuner() is called, I  
> believe this is a valid method of catching the removal.

The problem is more between 5 and 6 in this case, the mac80211 queue
must be flushed so it must wait until all tasks have completed. Nothing
should have been deleted that must be used when the thread start.
If the work structure is gone, that means rt2x00dev is gone, which really
shouldn't happen.

Don't get me wrong, you patch is still be valid, it is just that I believe it is
not covering the entire bug. ;)

Ivo



More information about the users mailing list