Re: Since Linux 4.13 tlp or powertop usage cause "xHCI host controller not responding, assume dead" on Dell 5855

On 10.04.2018 12:15, russianneuromancer@xxxxx wrote:
Hello!

On the Dell Venue 8 Pro 5855 tablet, installing tlp or running
"powertop --auto-tune" causes the error "xHCI host controller not
responding, assume dead". When the error happens, two integrated USB
devices (Bluetooth adapter and LTE modem) disappear until reboot. This
issue was first observed in Linux 4.13 and is still present in Linux
4.16. Blacklisting both "Linux Foundation 3.0 root hub" devices from
autosuspend in the tlp configuration works around the issue. However, on
other devices tlp works fine without blacklisting USB hub autosuspend,
and this tablet had no such issue before (at least in the Linux
~4.8-4.12 range), so I assume there is a regression somewhere.
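
To see which root hubs are currently allowed to autosuspend, the USB sysfs attributes can be inspected. This is a minimal sketch, assuming the usual /sys/bus/usb/devices layout; the function name list_root_hub_autosuspend and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# List each USB root hub with its vendor:product ID and runtime PM setting.
# power/control reads "auto" (autosuspend allowed) or "on" (kept awake).
list_root_hub_autosuspend() {
    base="${1:-/sys/bus/usb/devices}"
    for dev in "$base"/usb*; do
        # Skip entries that do not expose the runtime PM control file.
        [ -r "$dev/power/control" ] || continue
        printf '%s %s:%s %s\n' \
            "$(basename "$dev")" \
            "$(cat "$dev/idVendor")" \
            "$(cat "$dev/idProduct")" \
            "$(cat "$dev/power/control")"
    done
}
```

Calling list_root_hub_autosuspend with no argument reads the live sysfs tree. Writing "on" to a hub's power/control file as root should keep that hub awake until the value is set back to "auto".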

Are there any related commits between 4.12 and 4.13 that I could try to
revert?


In 4.12 we added sensitivity for reacting to hotplug-removed
xHC controllers, i.e. if we read 0xffffffff from an xhci register
we assume the host is removed and start cleaning up.

commit d9f11ba9f107aa335091ab8d7ba5eea714e46e8b
    xhci: Rework how we handle unresponsive or hoptlug removed hosts

You can try to revert that, but as a final solution we should
find the real root cause.

How the issue looks in the logs:

[  227.258385] xhci_hcd 0000:00:14.0: xHC is not running.
[  329.671544] xhci_hcd 0000:00:14.0: xHC is not running.
[  416.695796] xhci_hcd 0000:00:14.0: xHC is not running.

The "xHC is not running" message comes from the xhci driver handling a port
event interrupt for a resuming port while the whole host controller is not
running. We stop the host controller in xhci_suspend() and start it in
xhci_resume().

I'm attaching a patch that does a better job of preventing xhci host
suspend during USB2 resume signaling.
It could help and is worth a shot.

[  416.695862] xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead

This means xhci_hc_died() was called, and there are many possible call sites.
Adding the code below could give a hint:

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index daa94c3..51fb3d0 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -900,7 +900,8 @@ void xhci_hc_died(struct xhci_hcd *xhci)
        if (xhci->xhc_state & XHCI_STATE_DYING)
                return;
-       xhci_err(xhci, "xHCI host controller not responding, assume dead\n");
+       xhci_err(xhci, "%ps: xHCI host controller not responding, assume dead\n",
+                __builtin_return_address(0));
        xhci->xhc_state |= XHCI_STATE_DYING;
        xhci_cleanup_command_queue(xhci);
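
With that change applied, each "assume dead" line should carry the caller printed by %ps between the device name and the message. Here is a small sketch for pulling those callers out of a kernel log. The function name xhci_died_callers is hypothetical, and the pattern assumes the line format produced by the diff above.

```shell
#!/bin/sh
# Read kernel log text on stdin and print the caller recorded in each
# patched "assume dead" message. Lines without a caller prefix (from an
# unpatched kernel) and unrelated lines are skipped.
xhci_died_callers() {
    sed -n 's/^.*xhci_hcd [^ ]*: \(.*\): xHCI host controller not responding, assume dead.*$/\1/p'
}
```

A possible use is `dmesg | xhci_died_callers | sort | uniq -c`, which counts how often each call site reported the dead host.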

[  416.695900] xhci_hcd 0000:00:14.0: HC died; cleaning up
[  416.696052] usb 1-3: USB disconnect, device number 2
[  416.815610] cdc_mbim 1-3:1.12 wwp0s20u3i12: unregister 'cdc_mbim' usb-0000:00:14.0-3, CDC MBIM
[  416.847934] usb 1-4: USB disconnect, device number 3

After that, the Bluetooth adapter and LTE modem disappear from lsusb output,
while the xHCI controller itself remains visible.

We stop the host activity in xhci_hc_died(), so no USB devices under this host will work.

Complete dmesg: https://paste.fedoraproject.org/paste/7aMpVGLfZ82zppdGs56Oqg
lsusb -v: https://paste.fedoraproject.org/paste/c7y8GisC13YdzcYE9B-JIw
dsdt.dsl: https://paste.fedoraproject.org/paste/8g6mp2dafypUkFT4sa43iA

xhci traces and dynamic debug could help:

mount -t debugfs none /sys/kernel/debug
echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable

echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
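
Once the failure has been reproduced with tracing enabled, the trace buffer and kernel log should be saved before a reboot clears them. This is a rough sketch, assuming debugfs is mounted as above; the function name save_xhci_debug and its arguments are made up for illustration.

```shell
#!/bin/sh
# Copy the ftrace buffer and the kernel log into an output directory.
# $1: output directory
# $2: tracing directory (defaults to the debugfs tracing path)
save_xhci_debug() {
    out="$1"
    tracing="${2:-/sys/kernel/debug/tracing}"
    mkdir -p "$out" || return 1
    cat "$tracing/trace" > "$out/xhci-trace.txt" || return 1
    # dmesg may need root on systems that restrict kernel log access.
    dmesg > "$out/dmesg.txt" 2>/dev/null || echo "dmesg not readable" >&2
}
```

Run as root, for example `save_xhci_debug ~/xhci-debug`, right after the "assume dead" message appears.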

-Mathias
From 090b13a6df3f489a9781223dd959e03c2f81347b Mon Sep 17 00:00:00 2001
From: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx>
Date: Thu, 1 Mar 2018 18:48:32 +0200
Subject: [PATCH] xhci: prevent USB 2 roothub autosuspend during port resume
 signaling

The xhci USB 2 roothub tries to autosuspend itself again immediately after
being resumed by a remote wake. This can be avoided by calling
usb_hcd_start_port_resume() and usb_hcd_end_port_resume(), which were
implemented especially for this purpose.

Use them, and prevent roothub autosuspend during resume signaling.

Suggested-by: Anshuman Gupta <anshuman.gupta@xxxxxxxxx>
Signed-off-by: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx>
---
 drivers/usb/host/xhci-hub.c  | 3 +++
 drivers/usb/host/xhci-ring.c | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 72ebbc9..671a336 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -905,6 +905,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
 
 				set_bit(wIndex, &bus_state->resuming_ports);
 				bus_state->resume_done[wIndex] = timeout;
+				usb_hcd_start_port_resume(&hcd->self, wIndex);
 				mod_timer(&hcd->rh_timer, timeout);
 			}
 		/* Has resume been signalled for USB_RESUME_TIME yet? */
@@ -930,6 +931,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
 					msecs_to_jiffies(
 						XHCI_MAX_REXIT_TIMEOUT));
 			spin_lock_irqsave(&xhci->lock, flags);
+			usb_hcd_end_port_resume(&hcd->self, wIndex);
 
 			if (time_left) {
 				slot_id = xhci_find_slot_id_by_port(hcd,
@@ -970,6 +972,7 @@ static u32 xhci_get_port_status(struct usb_hcd *hcd,
 	    (raw_port_status & PORT_PLS_MASK) != XDEV_RESUME) {
 		bus_state->resume_done[wIndex] = 0;
 		clear_bit(wIndex, &bus_state->resuming_ports);
+		usb_hcd_end_port_resume(&hcd->self, wIndex);
 	}
 
 
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index daa94c3..a1cffe9 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1666,6 +1666,8 @@ static void handle_port_status(struct xhci_hcd *xhci,
 			bus_state->resume_done[faked_port_index] = jiffies +
 				msecs_to_jiffies(USB_RESUME_TIMEOUT);
 			set_bit(faked_port_index, &bus_state->resuming_ports);
+			usb_hcd_start_port_resume(&hcd->self, faked_port_index);
+
 			/* Do the rest in GetPortStatus after resume time delay.
 			 * Avoid polling roothub status before that so that a
 			 * usb device auto-resume latency around ~40ms.
-- 
2.7.4

