(solved) pm-utils hibernate issue

Live forum: http://rt2x00.serialmonkey.com/viewtopic.php?t=4808

xsism

22-05-2008 19:37:11

I have spent the better half of 3 days trying to get ubuntu hardy to hibernate without freezing on resume or any other part. I have tried different hibernate scripts like TuxOnIce and uswsusp with no difference in the outcome.

Finally I noticed in /var/log/pm-hibernate.log that rt2570 wasn't being unloaded from the modules list and this was causing the error. I did it manually and saw it was still in use. So I ran
[code2n23f1qp]sudo /etc/init.d/networking stop
[/code2n23f1qp]

followed by
[code2n23f1qp]sudo modprobe -r rt2570[/code2n23f1qp]

and it worked just fine.
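Whether the module is still held can be checked before trying to unload it. A minimal sketch, assuming the usual /proc/modules layout (name, size, use count, ...); the function name `module_use_count` and its optional file argument are made up for illustration:

```shell
#!/bin/sh
# module_use_count NAME [FILE]
# Print the use count of kernel module NAME, read from FILE
# (defaults to /proc/modules). Prints nothing if NAME is not loaded.
# Hypothetical helper, not part of pm-utils or the rt2570 driver.
module_use_count() {
    awk -v mod="$1" '$1 == mod { print $3 }' "${2:-/proc/modules}"
}
```

Running `module_use_count rt2570` before and after stopping networking should show whether the count has dropped to 0, which is when `modprobe -r` can be expected to succeed.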

So I added a custom script named `01custom.sh` to /usr/lib/pm-utils/sleep.d/ that looks like this
[code2n23f1qp]#!/bin/bash

/etc/init.d/networking stop
rmmod rt2570[/code2n23f1qp]

Then I clicked on hibernate from the top right power menu and the system went down like it should. I waited 30 seconds and booted back up and it worked perfectly.

I just had to manually restart networking and rt2570 as wlan0.
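The manual restart could probably be avoided by letting the hook handle both directions. A rough sketch, assuming pm-utils passes the phase (hibernate, suspend, thaw or resume) as the first argument to scripts in sleep.d; the `run` wrapper and the `DRY_RUN` variable are additions for illustration only, and this has not been tested against the rt2570 driver:

```shell
#!/bin/sh
# Sketch of a pm-utils sleep.d hook (e.g. saved as 01custom.sh).
# Assumption: pm-utils calls the hook with the phase as $1.

MODULE=rt2570

# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

sleep_hook() {
    case "$1" in
        hibernate|suspend)
            # Going down: stop networking so the module is no longer in use.
            run /etc/init.d/networking stop
            run modprobe -r "$MODULE"
            ;;
        thaw|resume)
            # Coming back: reload the driver, then restart networking.
            run modprobe "$MODULE"
            run /etc/init.d/networking start
            ;;
    esac
}

# Only act when a phase argument was supplied.
if [ -n "${1:-}" ]; then
    sleep_hook "$1"
fi
```

With `DRY_RUN=1` set, calling `sleep_hook hibernate` only prints the commands, which makes it easy to check the ordering before installing the hook.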

---

My point: rt2570 doesn't get unloaded properly. pm-utils uses D-Bus to `sleep` NetworkManager. I'm guessing that by default, with rt2500usb, NetworkManager would sleep or unload the rt2500 module, but it doesn't know how to interface with the rt2570 module.

Is there any way to have the rt2570 module unload properly automatically? Is there a version that supports ACPI hibernate or suspend?

PS - I was going to post this in the Ubuntu forums, but this is clearly an issue with rt2570. The only reason I am not using rt2500usb is that I can't connect and get an IP on my Linksys WUSB54g (even though I can see the available networks).

Vern

08-06-2008 23:22:49

xsism wrote: "Is there a version that supports ACPI hibernate or suspend?"

The attached patch - pm1.patch.gz - implements suspend/resume for the rt2570. After you apply it, you should not need to unload then reload the driver when you click on hibernate.

Would you like to try it? If it doesn't work, please build and run with debug enabled, then attach a gzipped copy of the log data here. If it works OK, I'll see if I can get it in CVS.

Thanks,

Vern

15-06-2008 17:42:05

Well, this thing has been up for a week, has been downloaded, and there are no complaints.

So ...

Going, going, gone. You should start seeing it in the hourly tarballs Soon.

aatoma

19-06-2008 08:28:13

Hi Vern,

I had the same exact issue with a DLink DWL-G122, the system used to hang when suspending/hibernating so, yesterday, I tried the daily snapshot marked '20080618'.
Now I'm able to suspend/resume multiple times in a row and get wireless working flawlessly on resume, but still no luck with hibernate.
Is there any chance of getting hibernate fixed with a similar patch, or is that a really different issue?

Thank you and regards.
Antonio.

Vern

19-06-2008 17:16:19

Hi aatoma,

Thanks for the unit test. To the best of my knowledge, hibernate should be OK - unless there's something I've missed.

What are the actual symptoms that cause you to conclude that there's "no luck with hibernate"?

Thanks,

aatoma

20-06-2008 10:28:23

Hi Vern,

as it was the case for suspend, before applying your patch, when I try to hibernate my PC, the screen goes off, but the PSU and GPU fans keep on turning.
When this happens, there is no way of getting it right other than keeping the power button down for several seconds and rebooting.
On the other hand, if I don't plug in the USB wireless dongle at all, both suspend and hibernate work perfectly.

Let me know if I can help you troubleshoot this by posting some log files or config info.

Thank you and regards.
Antonio.

Vern

21-06-2008 18:58:47

Hi aatoma,

Thanks for volunteering the troubleshooting help. As it turns out, there was something I missed.

I've looked into the matter a little bit. It turns out that the driver uses slightly different context mechanisms than are assumed by the suspend-to-disc (STD aka hibernate) capability. It'll take a little navel contemplation to work it all out.

In the meantime, you may try a workaround using the procedure described by xsism in his original post in this thread.

When I do get something worked up, I'll attach the patch to a posting in this thread and we can take it from there.

Thanks,

Vern

22-06-2008 19:10:52

Hi aatoma,

Can you try the attached patch - frz1.patch.gz? If everything's OK, it should let you hibernate. I don't know very much about this, and I've made a couple of working assumptions:

[list4xnvc5ya]The freeze logic need not be very fine-grained.
A process gets a "dummy" signal to kick off hibernation.[/listu4xnvc5ya]

Needless to say, the more you can bang this thing around, the better. Let me know your results here.

If there are problems, could you compile and run with debug enabled, then attach a gzipped copy of /var/log/debug to a posting here?

Thanks,

aatoma

23-06-2008 21:46:47

Hi Vern,

I tried your patch but nothing changed. Here is the debug log file I got after suspend (working, on Jun 23 23:34) and hibernate (not working, on Jun 23 23:38).

Hope this helps.
Antonio.

Vern

24-06-2008 18:54:57

Hi aatoma,

Thanks for working up the debug info. Unfortunately, there are no messages in syslog.gz - not even the ones labeled "rausb0" - that are issued by the driver. Did you compile with debug enabled? Did you run modprobe with debug=31?

Are debug messages routed to /var/log/syslog (I guess that's what you have) as well as to /var/log/debug? If not, could you attach a gzipped copy of /var/log/debug to a posting here?

Even though there's no debug info, it's evident that the patch didn't achieve its purpose. I've attached a slightly modified version - frz2.patch.gz - that's designed to get a little more info to test my assumptions. Could you apply it, build with debug enabled, then run it with "debug=31" supplied as one of the parameters to modprobe?

Having a gzipped copy of /var/log/kern.log (or your system's equivalent) as well as of /var/log/debug attached to your response here will be helpful.
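For gathering the files, something along these lines may save a few steps. It is only a sketch: the function name `collect_logs` and the destination directory are hypothetical, and reading files under /var/log may require root.

```shell
#!/bin/sh
# collect_logs DEST FILE...
# Write a gzipped copy of each readable FILE into directory DEST,
# leaving the originals untouched. Hypothetical helper for illustration.
collect_logs() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -r "$f" ]; then
            gzip -c "$f" > "$dest/$(basename "$f").gz"
        else
            echo "skipping $f (not readable)" >&2
        fi
    done
}
```

For example, `collect_logs "$HOME/rt2570-logs" /var/log/debug /var/log/kern.log /var/log/syslog` would leave one .gz file per log, ready to attach.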

Thanks,

aatoma

25-06-2008 18:39:58

Hi Vern,

I took the following steps:

[code1lh4oahh]patch -p1 < frz2.patch[/code1lh4oahh]
that patched 2 files, then
[code1lh4oahh]make debug
sudo make install
sudo modprobe rt2570 debug=31[/code1lh4oahh]

and then suspended my PC.
Here is the outcome, hope this helps.

Antonio.

Vern

26-06-2008 01:32:43

Hi aatoma,

Thanks for going through the work of generating these logs.

It looks like in general, the system does what I thought on hibernate. However, I was trying to be too cute, and as a result, the driver ends up exiting rather than trying to freeze.

I see too, that we're getting "kernel BUG at include/linux/timer.h ..." messages. I strongly suspect this is due to a bug in the earlier suspend/resume implementation - namely I need to cancel timers. So this version of the patch tries to fix that, too.

Anyway, the attached patch - frz3.patch.gz - should fix that up. It modifies rt_config.h as follows:
Conditionally #include <linux/freezer.h> if > 2.5.18.
Only define set_freezable, etc. if kernel < 2.5.18.

It also modifies the suspend/resume logic in rtusb_main.c to better quiesce/reactivate the device.

Could you please do the whole thing all over again?

Thanks,

aatoma

27-06-2008 21:26:59

Hi Vern,

seems like we have broken something this time ... :(
I applied patch #3 but, when I tried to hibernate my PC, the screen stayed on with several error messages and then the PC rebooted spontaneously.
Also, after rebooting, I lost control of my keyboard and so had to reboot once more.
I attached three log files as usual.
In the meantime I reverted the driver to 20080618 snapshot.

Regards.
Antonio.

aatoma

28-06-2008 17:01:15

Hi Vern,

I was no longer sure I did everything correctly, so I tried once more.
This time I couldn't even start rausb0 (and I'm no longer sure it was working when I first tried to hibernate).
I attached the logs.

Thank you and regards.
Antonio.

Vern

29-06-2008 18:22:43

Hi aatoma,

Sorry to put you through such a siege. Thanks for your efforts, and for the log data. Unfortunately, I don't see any debug info in your latest logs. Did you compile with debug enabled? Did you supply "debug=31" as a parameter to modprobe?

What happens if you just suspend-to-ram (sleep) instead of suspend-to-disk (hibernate)? Looking at your kern.log data, it looks like the driver threads are doing the same thing as all the other processes on suspend-to-disk (i.e. that part *may* really be "working"); but they may be exacerbating the suspend-to-ram (which is done as part of suspend-to-disk) symptoms. So could you reapply frz3.patch.gz to the latest CVS, run with debugging on, and try just suspend-to-ram? If we can get some debug data from that, it'll really help.

Thanks,

aatoma

29-06-2008 20:08:00

Hi Vern,

this time should be fine (a lot more stuff in the log files). I tried first to suspend (to RAM) but, after a few messages on the screen, xfce restarted spontaneously. Then I tried to hibernate (to disk), getting some messages on the screen, and then the PC hung as usual.
I hope this will help you.
Feel free to ask me for additional details since I would really love to have suspend/hibernate fully working.

Regards.
Antonio.

aatoma

30-06-2008 08:08:01

Hi Vern,

just to add a bit of information, when I tried to suspend-to-ram, I saw an error message that I don't see in the log files, stating two tasks failed to freeze (or something similar).
The two tasks were 'rt2570Mlme' and 'rt2570Cmd' so I think this is relevant.

Regards.
Antonio.

Vern

30-06-2008 21:56:45

Hi aatoma,

Looking at your latest log files, I'm starting to think I need a baseline. Could you build the latest CVS - unpatched - with debug enabled, do suspend, then resume (not hibernate) and post a gzipped copy of /var/log/debug here?

BTW the only difference between the latest CVS and that of 20080618 is that the latest CVS detects when an ARM target is compiled in big endian mode. Nothing we're doing here goes into CVS until we're satisfied that it works.

Anyway, thanks again for your work,

aatoma

02-07-2008 20:28:24

Hi Vern,

here is the debug log of a suspend(not hibernate)/resume session with a freshly downloaded (20080702) cvs tar.
Hope it helps.

Good hunting.
Antonio.

Vern

02-07-2008 20:47:56

I'm really slipping. I should also have asked if this resulted in a "kernel BUG" message in /var/log/kern.log. Did it?

If it did, could you post a gzipped copy of /var/log/kern.log too?

Thanks,

aatoma

03-07-2008 20:03:18

Hi Vern,

no evidence about the kernel bug in log files from July 2nd.
Anyway I attached the whole set of log files for completeness

Thank you and regards.
Antonio.

Vern

04-07-2008 01:22:44

Hi aatoma,

Thanks for the additional log info.

After untarring your log file and gzipping the components, I see there are no "kernel BUG ..." messages in kern.log; so the current suspend logic in CVS is probably OK (or at least does no harm), and it looks like I should have left it alone.

The system's resume activity is not what I expected. Here's an extract of your kern.log content:
[code2das67mk]# zegrep -n '_suspend|_open|_close|_probe|_discon|: exit\b|: init\b' $aato/080703/k*|cut -d: -f1,5-
...
5906: [ 332.896877] rt2570: ---> rt2570_suspend()
5976: [ 332.960818] rt2570: <-- common_suspend()
6033: [ 3.759344] rt2570: --> usb_rtusb_disconnect
6036: [ 3.774758] rt2570: --> usb_rtusb_close
6049: [ 4.767941] rt2570: <-- usb_rtusb_close
6057: [ 5.037175] rt2570: <-- usb_rtusb_disconnect
6065: [ 5.607221] rt2570: --> usb_rtusb_probe (2.6)
6205: [ 7.766693] rt2570: <-- common_probe: Status = 0
6206: [ 7.766710] rt2570: <-- usb_rtusb_probe: res=0
6211: [ 9.014419] rt2570: --> usb_rtusb_open: driver version 1.0.0
6224: [ 9.019325] rt2570: <-- usb_rtusb_open: OK[/code2das67mk]
As you see, we call the suspend function, but some time later we do a disconnect->probe->open sequence instead of calling the driver's resume function. Are we sure we're not waking up from a suspend to disk, rather than resuming from a suspend to RAM?

If the answer to that question is yes, I'll modify the patch to leave the current suspend/resume logic alone and just add the freeze capability.
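The same comparison can be reduced to a few counts. This is a sketch only: `count_marker` and `resume_report` are hypothetical helpers, the file is expected to be an uncompressed kern.log, and "rt2570_resume" is an assumed name for the driver's resume entry point (the other two markers come from the extract above).

```shell
#!/bin/sh
# count_marker PATTERN FILE
# Print how many lines of FILE match the extended regex PATTERN.
count_marker() {
    grep -c -E -- "$1" "$2"
}

# resume_report FILE
# Compare driver suspend entries with resume entries and re-probes.
# "rt2570_resume" is an assumed function name; the other two markers
# are taken from the log extract above.
resume_report() {
    echo "suspend: $(count_marker '--> rt2570_suspend' "$1")"
    echo "resume:  $(count_marker '--> rt2570_resume' "$1")"
    echo "probe:   $(count_marker '--> usb_rtusb_probe' "$1")"
}
```

A suspend count that is matched by probes rather than resumes would point at the disconnect->probe->open path described above.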

Thanks,

aatoma

04-07-2008 10:29:40

Hi Vern,

I'm sure this time it was suspend/resume.
Also note that the usual Linux boot message sequence

[code1p85a363]
Jun 29 21:53:51 mythbuntu-1 kernel: [ 0.000000] Initializing cgroup subsys cpuset
Jun 29 21:53:51 mythbuntu-1 kernel: [ 0.000000] Initializing cgroup subsys cpu
Jun 29 21:53:51 mythbuntu-1 kernel: [ 0.000000] Linux version 2.6.24-19-generic (buildd@palmer) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Wed Jun 18 14:43:41 UTC 2008 (Ubuntu 2.6.24-19.34-generic)
[/code1p85a363]

is missing, while I always get it when I hibernate, since I have to restart my PC.
I assume that disconnect-->probe-->open has probably been used to bring the USB card back up after suspend, and I agree that it is safer to leave it there.
Looking forward to testing your new patch.
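The boot-banner check can be scripted. A rough sketch, assuming the kernel log lines look like the ones quoted above; `boot_count` is a hypothetical helper name:

```shell
#!/bin/sh
# boot_count FILE
# Print the number of kernel boot banners ("Linux version ..." logged
# at timestamp 0.000000) found in FILE. Hypothetical helper.
boot_count() {
    grep -c 'kernel: \[ *0\.000000] Linux version' "$1"
}
```

If `boot_count /var/log/kern.log` gives the same number before and after the sleep cycle, no fresh kernel boot was logged in between, which is what one would expect from a suspend to RAM.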

Regards.
Antonio.

Vern

04-07-2008 17:58:59

Hi aatoma,

OK. This version - frz4.patch.gz - leaves the suspend/resume logic unchanged from what it now is in CVS and just adds (hopefully) hibernation to the driver threads.

Can you try it and see what happens?

Thanks,

aatoma

10-07-2008 18:52:21

Hi Vern,

sorry for the late reply.
I tried your latest patch but got the same result.
The screen goes black, then I get an error message about two tasks failing to freeze (rt2570Mlme and rt2570Cmd) and then xfce resumes.
As usual I attached the three logs.

Regards.
Antonio.

aatoma

18-07-2008 18:06:38

Giving up? :(

Vern

18-07-2008 20:57:19

No.

It turns out that suspend/resume as currently implemented works mostly on dumb luck, and is not robust in the presence of wireless link activity. Looking more closely into the "kernel BUG" messages in your log, it's apparent that there are several mutual exclusion and serialization problems that need to be fixed; so I'm contemplating my navel, here, in order to make some improvement.

Thanks,

Vern

20-07-2008 21:05:28

Hi aatoma,

I've updated the patch - attached as frz5.patch.gz - yet again. It seems to work without complaint for me, but I suspect I'm not doing everything you're doing. Could you try it out in two stages?

First, try just suspend/resume. If that seems OK, then try freeze.

Thanks,

aatoma

21-07-2008 20:43:05

Hi Vern,

this time the behavior of suspend has been similar to that of hibernate.
The screen goes black, then I get the error message about the two tasks failing to freeze within 20 seconds
(the names of the two tasks this time changed to rausb0-Mlme and rausb0-Cmd), then the leds on the wireless dongle are switched off for a few seconds and finally xfce (and wireless as well) resumes.
I also tried to hibernate getting similar results.

I attached the three logs obtained during a suspend attempt.

Regards.
Antonio.

Vern

25-07-2008 20:31:31

Hi aatoma,

Well, once more into the fray.

I suspect that part of the problem has to do with the use of semaphores instead of wait queues for synchronization. It may be that I've misunderstood that interplay. It *may* be that a thread in a semaphore wait state is "automagically" frozen. Accordingly, I've modified the patch - attached as frz6.patch.gz - to comment out try_to_freeze calls and use a semaphore to synchronize suspend/resume functions with other activity to depend on whether or not the device has been opened.

To be honest, I'm unable to compile a version of 2.6.25 in which the "state" file in /sys/class/net/rausb0/device/power/state exists. So I'm flying a little blind, here. Could you share the settings for the power management and suspend entries in your .config file? Maybe that'll help me.

So could you try the patch first with just a selective suspend/resume? If that seems OK, then with suspend-to-disk?
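In case it helps with locating those entries: on a distribution kernel the build configuration is usually shipped as a file rather than as a .config in a source tree. A minimal sketch, assuming it lives at /boot/config-$(uname -r) as on Ubuntu; the function name `pm_config` and the list of option prefixes are guesses at what is relevant, not a definitive list:

```shell
#!/bin/sh
# pm_config [FILE]
# Print power-management related options, set or unset, from a kernel
# config file. FILE defaults to /boot/config-$(uname -r), which is an
# assumption about where the distribution installs it.
pm_config() {
    grep -E '^(# )?CONFIG_(PM|SUSPEND|HIBERNATION|USB_SUSPEND|ACPI_SLEEP)' \
        "${1:-/boot/config-$(uname -r)}"
}
```

The output of `pm_config` can be pasted straight into a reply.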

Thanks,

aatoma

26-07-2008 15:13:40

Hi Vern,

here are the results after applying your latest patch.
I tried to suspend but nothing changed.
Since I'm not sure about the file you are referring to with '.config', I attached

/etc/default/acpi-support
/usr/lib/pm-utils/functions

for power management configuration.

Regards.
Antonio.

aatoma

28-07-2008 08:28:36

Hi Vern,

this may sound weird.
After messing up things with different versions of the rt2570 driver, I decided yesterday to have a cleaner situation.
So I installed vanilla rt2570 from daily snapshot.
Then I checked if suspend was still working after all the things I had changed and restored and it did (as I expected).
Then I tried to hibernate and it did hang as usual, but I left it in that state.
The strange thing is when I came back it was shutdown!
I restarted it and it resumed from hibernation (I intentionally left some applications open).
So I tried again and stayed there: after more than two minutes it does hibernate!
I'm not sure I waited that long before shutting it down manually in my first attempts.
So my question is: are we chasing a solution to the wrong issue?
Maybe the real problem is why it takes so long (say 2 minutes instead of 20 seconds) to hibernate when the rt2570 driver is loaded.

I'll have some further testing and come back to you.

Regards.
Antonio.

Vern

31-07-2008 20:44:42

Hi aatoma,

Sorry to keep you dangling out there so long. Hope you haven't spent too much time looking into the weird "eventual restart" symptoms you described.

I eventually figured how to sorta get STD-STR working on my own system, and to get the driver to freeze/unfreeze and suspend/resume. Of course, once the logic was actually exercised, I found several problems with it, and that is what I've been taking care of.

I've attached the patch as frz7.patch.gz. I probably haven't covered everything, but it does seem to work OK for me. Could you try it and let me know your results?

Thanks,

aatoma

02-08-2008 17:35:01

Hi Vern,

great job! Things are really getting right!
Not only am I now able to suspend/resume and hibernate/resume but, after applying your latest patch,
the time it takes to hibernate with the driver loaded is in line with the time it takes without the driver loaded.
So everything seems to be OK, and I plan to leave this patched driver online for some random testing.
I would like to thank you so much for your effort and will let you know about any other issue I should encounter.

Thank you and regards.
Antonio.

Vern

03-08-2008 18:50:22

Hi aatoma,

Great!

Since you've evidently been running for a while with no problems, it looks like we've got a fix. Accordingly, I've submitted the patch for inclusion in CVS. You should see it show up in the hourly tarballs soon if all goes well.

I'm marking the topic as solved. Thanks again for all your work.

Vern

11-08-2008 16:15:01

hormati

I've split the posts in the part of this thread that deal with your problem into a new topic "Blank Screen on Hibernate". Please use that.

Thanks,

aatoma

12-08-2008 07:19:18

Hi Vern,

this is just to let you know (been away for a few days 8) ) that I'm happily running
an unpatched version of the rt2570 cvs snapshot without a problem and to thank you
once again.

Best regards.
Antonio.

Vern

14-08-2008 16:33:13

You're welcome. Nice to get an attaboy.

Thanks,