Closed Bug 740242 Opened 12 years ago Closed 12 years ago

runtests.py | Received unexpected exception while running application ''NoneType' object has no attribute 'group'' (mostly in windows debug test mochitests-1/5)

Categories

(Testing :: Mochitest, defect, P3)

x86
Windows Vista
defect

Tracking

(firefox13 fixed, firefox14 fixed)

RESOLVED FIXED
mozilla15
Tracking Status
firefox13 --- fixed
firefox14 --- fixed

People

(Reporter: rnewman, Assigned: ttaubert)

References

Details

(Keywords: intermittent-failure, Whiteboard: [cleanup])

Attachments

(3 files)

I have no idea.

https://tbpl.mozilla.org/php/getParsedLog.php?id=10457230&full=1&branch=mozilla-central

xul!JSD_GetValueForObject+0x00000000000BDBE1
xul!JSD_GetValueForObject+0x000000000008CE76
xul!XRE_RunAppShell+0x0000000000000029
xul!JSD_GetValueForObject+0x0000000000182677
xul!StartupHooker+0x0000000000013835
xul!StartupHooker+0x0000000000013D72
xul!StartupHooker+0x000000000001423D
xul!XRE_InitChildProcess+0x00000000000003E8
0x000000000101140D
0x0000000001011571
0x00000000010118FF
0x000000000101172F
kernel32!BaseThreadInitThunk+0x0000000000000012
ntdll!RtlInitializeExceptionChain+0x0000000000000063
ntdll!RtlInitializeExceptionChain+0x0000000000000036
###!!! ASSERTION: Should have mContext in non-global/non-process manager!: 'mContext || (aChrome && !aParentManager) || aProcessManager', file e:\builds\moz2_slave\m-cen-w32-dbg\build\content\base\src\nsFrameMessageManager.h, line 101
--DOMWINDOW == 406 (19271830) [serial = 644] [outer = 00000000] [url = file:///c:/users/cltbld/appdata/local/temp/tmpmeya7q/mochikit-22.tmp/89419.html]
--DOMWINDOW == 405 (17B89878) [serial = 611] [outer = 00000000] [url = http://mochi.test:8888/tests/docshell/test/chrome/582176_xml.xml]
--DOMWINDOW == 404 (0D52A928) [serial = 609] [outer = 00000000] [url = chrome://mochitests/content/chrome/docshell/test/chrome/bug582176_window.xul]
--DOMWINDOW == 403 (0D52C390) [serial = 600] [outer = 00000000] [url = chrome://mochitests/content/chrome/docshell/test/chrome/bug113934_window.xul?chrome]
--DOMWINDOW == 402 (19272208) [serial = 665] [outer = 00000000] [url = javascript:'1';]
--DOMWINDOW == 401 (17D20940) [serial = 663] [outer = 00000000] [url = data:application/vnd.mozilla.xul+xml,<?xml%20version="1.0"?><window%20xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul"/>]
--DOMWINDOW == 400 (19271A28) [serial = 653] [outer = 00000000] [url = http://mochi.test:8888/tests/docshell/test/chrome/92598_nostore.html]
--DOMWINDOW == 399 (19271E18) [serial = 626] [outer = 00000000] [url = file:///c:/users/cltbld/appdata/local/temp/tmpmeya7q/mochikit-21.tmp/662200b.html]
--DOMWINDOW == 398 (17D21778) [serial = 651] [outer = 00000000] [url = chrome://mochitests/content/chrome/docshell/test/chrome/bug92598_window.xul]
--DOMWINDOW == 397 (17B888B8) [serial = 636] [outer = 00000000] [url = data:text/html,<title>new%20load</title>]
--DOMWINDOW == 396 (0D52C598) [serial = 634] [outer = 00000000] [url = chrome://mochitests/content/chrome/docshell/test/chrome/bug690056_window.xul]
--DOMWINDOW == 395 (17D20530) [serial = 642] [outer = 00000000] [url = chrome://mochitests/content/chrome/docshell/test/chrome/bug89419_window.xul]
INFO | runtests.py | Received unexpected exception while running application ''NoneType' object has no attribute 'group''

WARNING: nsExceptionService ignoring thread destruction after shutdown: file e:/builds/moz2_slave/m-cen-w32-dbg/build/xpcom/base/nsExceptionService.cpp, line 199
nsStringStats
 => mAllocCount:           2905
 => mReallocCount:          223
 => mFreeCount:            2905
 => mShareCount:           8464
 => mAdoptCount:            127
 => mAdoptFreeCount:        127
ERROR: The process "1024" not found.
SUCCESS: The process with PID 2716 has been terminated.
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | missing output line for total leaks!
TEST-UNEXPECTED-FAIL | plugin process 3076 | automationutils.processLeakLog() | missing output line for total leaks!
==> process 3376 will purposefully crash
INFO | automationutils.processLeakLog() | process 3376 was deliberately crashed and thus has no leak log

INFO | runtests.py | Running tests: end.
Traceback (most recent call last):
  File "mochitest/runtests.py", line 914, in <module>
  File "mochitest/runtests.py", line 911, in main
  File "mochitest/runtests.py", line 719, in runTests
  File "mochitest/runtests.py", line 610, in cleanup
  File "c:\mozilla-build\python25\Lib\shutil.py", line 174, in rmtree
    onerror(os.remove, fullname, sys.exc_info())
  File "c:\mozilla-build\python25\Lib\shutil.py", line 172, in rmtree
    os.remove(fullname)
WindowsError: [Error 13] The process cannot access the file because it is being used by another process: 'c:\\users\\cltbld\\appdata\\local\\temp\\tmpmeya7q\\animals.sqlite'

command timed out: 1200 seconds without output, attempting to kill
SIGKILL failed to kill process
using fake rc=-1
program finished with exit code -1
Oh, Windows. Not sure if deploying jhford's tool from bug 727551 will help here unless we also make runtests.py aware of it.
Component: Release Engineering → Release Engineering: Platform Support
Priority: -- → P3
QA Contact: release → coop
Whiteboard: [cleanup]
Based on the log in comment #22 only, is there a problem with the tests failing, leaving runtests.py trying to delete a file locked by firefox still running, and then buildbot fails to delete all the processes ? The last is filed elsewhere.
No idea - my feeling was that it was a bug in runtests.py (or, really, something it includes) failing to properly check whether a regex matched before doing something to one of the groups from the NoneType match, since that's what we usually do when we "'NoneType' object has no attribute 'group'", but I couldn't find one that didn't check.
binaries in hg is sub-optimal, but it's a tiny one and we already have the stackwalk ones in this repo.  Also opsi.
Attachment #614180 - Flags: review?(coop)
Attachment #614180 - Flags: review?(coop) → review+
jmaher, there seems to be multiple issues here so perhaps you can help untangle them. This seems to be typical (from the log in comment #42) but the tests running prior to the fail does vary:

88228 INFO TEST-PASS | /tests/content/media/test/test_fragment_noplay.html | big.wav#t=3-7 fragment test: mozFragmentEnd (9.287981) == end Time (9.287981)
88229 INFO TEST-PASS | /tests/content/media/test/test_fragment_noplay.html | [finished big.wav#t=3-7-4] Length of array should match number of running tests - 1 should equal 1
--DOMWINDOW == 42 (0B053AA8) [serial = 2631] [outer = 00000000] [url = about:blank]
--DOMWINDOW == 41 (1094F7F8) [serial = 2636] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_error_on_404.html]
--DOMWINDOW == 40 (07E5E800) [serial = 2635] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/bogus.ogv]
--DOMWINDOW == 39 (12169808) [serial = 2633] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_error_in_video_document.html]
--DOMWINDOW == 38 (118BCFB0) [serial = 2632] [outer = 00000000] [url = about:blank]
--DOMWINDOW == 37 (107D1348) [serial = 2622] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_delay_load.html]
--DOMWINDOW == 36 (130D3738) [serial = 2619] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_decode_error.html]
--DOMWINDOW == 35 (13A64298) [serial = 2618] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_currentTime.html]
--DOMWINDOW == 34 (11EAC178) [serial = 2617] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_controls.html]
--DOMWINDOW == 33 (11C8F5A8) [serial = 2616] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_contentDuration7.html]
--DOMWINDOW == 32 (11B6EFC0) [serial = 2615] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_contentDuration6.html]
--DOMWINDOW == 31 (11885158) [serial = 2614] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_contentDuration5.html]
--DOMWINDOW == 30 (11F59178) [serial = 2613] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_contentDuration4.html]
--DOMWINDOW == 29 (1240A0B0) [serial = 2612] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_contentDuration3.html]
--DOMWINDOW == 28 (10F97218) [serial = 2611] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_contentDuration2.html]
--DOMWINDOW == 27 (11C27D70) [serial = 2610] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_contentDuration1.html]
--DOMWINDOW == 26 (10F5C068) [serial = 2609] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_constants.html]
--DOMWINDOW == 25 (119830D8) [serial = 2608] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_closing_connections.html]
--DOMWINDOW == 24 (11858890) [serial = 2621] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_defaultMuted.html]
--DOMWINDOW == 23 (11EDC078) [serial = 2620] [outer = 00000000] [url = http://mochi.test:8888/tests/content/media/test/test_decoder_disable.html]
WARNING: Resource read failed: file e:/builds/moz2_slave/m-in-w32-dbg/build/content/media/wave/nsWaveReader.cpp, line 301
WARNING: NS_ENSURE_SUCCESS(res, -1) failed with result 0x80004005: file e:/builds/moz2_slave/m-in-w32-dbg/build/content/media/ogg/nsOggReader.cpp, line 705
WARNING: NS_ENSURE_SUCCESS(res, -1) failed with result 0x80004005: file e:/builds/moz2_slave/m-in-w32-dbg/build/content/media/ogg/nsOggReader.cpp, line 705
WARNING: NS_ENSURE_SUCCESS(res, -1) failed with result 0x80004005: file e:/builds/moz2_slave/m-in-w32-dbg/build/content/media/ogg/nsOggReader.cpp, line 705
WARNING: NS_ENSURE_SUCCESS(res, -1) failed with result 0x80004005: file e:/builds/moz2_slave/m-in-w32-dbg/build/content/media/ogg/nsOggReader.cpp, line 705
WARNING: NS_ENSURE_SUCCESS(res, -1) failed with result 0x80004005: file e:/builds/moz2_slave/m-in-w32-dbg/build/content/media/ogg/nsOggReader.cpp, line 705

### Firefox may have crashed here

INFO | runtests.py | Received unexpected exception while running application 
''NoneType' object has no attribute 'group''

### runtests.py doesn't deal with that very gracefully

WARNING: nsExceptionService ignoring thread destruction after shutdown: file e:/builds/moz2_slave/m-in-w32-dbg/build/xpcom/base/nsExceptionService.cpp, line 199
nsStringStats
 => mAllocCount:         115268
 => mReallocCount:         8310
 => mFreeCount:          115268
 => mShareCount:         136316
 => mAdoptCount:           3683
 => mAdoptFreeCount:       3683
ERROR: The process "1044" not found.
SUCCESS: The process with PID 3968 has been terminated.
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | missing output line for total leaks!
TEST-UNEXPECTED-FAIL | plugin process 1112 | automationutils.processLeakLog() | missing output line for total leaks!

== BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, plugin process 1364

     |<----------------Class--------------->|<-----Bytes------>|<----------------Objects---------------->|<--------------References-------------->|
                                              Per-Inst   Leaked    Total      Rem      Mean       StdDev     Total      Rem      Mean       StdDev
   0 TOTAL                                          26        0      345        0 (   23.80 +/-    18.56)      248        0 (   53.05 +/-    42.14)

nsTraceRefcntImpl::DumpStatistics: 23 entries
TEST-PASS | plugin process 1364 | automationutils.processLeakLog() | no leaks detected!

INFO | runtests.py | Running tests: end.
Traceback (most recent call last):
  File "mochitest/runtests.py", line 914, in <module>
  File "mochitest/runtests.py", line 911, in main
  File "mochitest/runtests.py", line 719, in runTests
  File "mochitest/runtests.py", line 610, in cleanup
  File "c:\mozilla-build\python25\Lib\shutil.py", line 169, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "c:\mozilla-build\python25\Lib\shutil.py", line 169, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "c:\mozilla-build\python25\Lib\shutil.py", line 169, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "c:\mozilla-build\python25\Lib\shutil.py", line 174, in rmtree
    onerror(os.remove, fullname, sys.exc_info())
  File "c:\mozilla-build\python25\Lib\shutil.py", line 172, in rmtree
    os.remove(fullname)
WindowsError: [Error 13] The process cannot access the file because it is being used by another process: 'c:\\docume~1\\cltbld\\locals~1\\temp\\tmpig9qs_\\Cache\\2\\55\\BC091d01'

### Something is still locking the file from the Firefox crash

command timed out: 1200 seconds without output, attempting to kill
SIGKILL failed to kill process
using fake rc=-1
program finished with exit code -1

### buildbot fails to kill some process too (known issue on windows)
Summary: runtests.py | Received unexpected exception while running application ''NoneType' object has no attribute 'group'' → runtests.py | Received unexpected exception while running application ''NoneType' object has no attribute 'group'' (mostly in windows debug test mochitests-1/5)
Is there a clear action for RelEng here ? Should it go to Core:Testing for debugging ?
Halp! Any insight what's going wrong here ?
Component: Release Engineering: Platform Support → Mochitest
Product: mozilla.org → Testing
QA Contact: coop → mochitest
Version: other → Trunk
My guess is that:
1) Something is failing a regex test and breaking some code, causing an exception to propagate outwards to runtests.py
2) runtests.py then tries to clean up the profile and fails because it hasn't waited for the app to exit, so profile files are still locked.

It's really hard to guess what regex is failing, my best guess is:
http://mxr.mozilla.org/mozilla-central/source/build/automationutils.py#533

We could change the log statement here to be .log.exception:
http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#703

which would give us a traceback instead of just the exception info, which would help.
Probably fallout from bug 683953, it looks like.
Blocks: 683953
This patch does two things that should improve our diagnostics here:
1) use log.exception if we catch an exception, which will print a full traceback, so we can see where we're actually failing
2) log the output line before passing it off to the leak logger code, so if that code is what's failing, we can see what output it's choking on
Attachment #619937 - Flags: review?(jmaher)
Comment on attachment 619937 [details] [diff] [review]
add some diagnostics to get more info about Python exceptions during Mochitest

Review of attachment 619937 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/mochitest/runtests.py
@@ +699,5 @@
>      except KeyboardInterrupt:
>        self.automation.log.info("INFO | runtests.py | Received keyboard interrupt.\n");
>        status = -1
>      except:
> +      self.automation.log.exception("INFO | runtests.py | Received unexpected exception while running application\n")

Nit: Should 'INFO' become 'WARNING' or the like too?
Comment on attachment 619937 [details] [diff] [review]
add some diagnostics to get more info about Python exceptions during Mochitest

Review of attachment 619937 [details] [diff] [review]:
-----------------------------------------------------------------

looks good to me.
Attachment #619937 - Flags: review?(jmaher) → review+
(In reply to Serge Gautherie (:sgautherie) from comment #80)
> Nit: Should 'INFO' become 'WARNING' or the like too?

Seems irrelevant, there's no real convention here, it's purely informational.
https://tbpl.mozilla.org/php/getParsedLog.php?id=11399068&tree=Mozilla-Inbound

--WARNING: Resource read failed: file e:/builds/moz2_slave/m-in-w32-dbg/build/content/media/wave/nsWaveReader.cpp, line 301
DOMWINDOW == 18 (10E935E0) [serial = 2713] [outer = 00000000] [url = chrome://browser/content/devtools/gcliblank.xhtml]
--DOMWINDOW == 17 (106044E8) [serial = 2712] [outer = 00000000] [url = chrome://browser/content/devtools/gcliblank.xhtml]
--DOMWINDOW == 16 (1WARNING: NS_ENSURE_SUCCESS(res, -1) failed with result 0x80004005: file e:/builds/moz2_slave/m-in-w32-dbg/build/content/media/ogg/nsOggReader.cpp, line 705
INFO | runtests.py | Received unexpected exception while running application
Traceback (most recent call last):
  File "mochitest/runtests.py", line 698, in runTests
    timeout = timeout)
  File "c:\talos-slave\test\build\mochitest\automation.py", line 900, in runApp
    status = self.waitForFinish(proc, utilityPath, timeout, maxTime, startTime, debuggerInfo, symbolsPath, logger)
  File "c:\talos-slave\test\build\mochitest\automation.py", line 749, in waitForFinish
    logger.log(line)
  File "c:\talos-slave\test\build\mochitest\automationutils.py", line 474, in log
    self._logWindow(line)
  File "c:\talos-slave\test\build\mochitest\automationutils.py", line 508, in _logWindow
    id = self._parseValue(line, "serial")
  File "c:\talos-slave\test\build\mochitest\automationutils.py", line 533, in _parseValue
    return re.search("\[%s = (.+?)\]" % name, line).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
oh-HO! So my guess in comment 77 was correct. We can also see the log line it's choking on:
--DOMWINDOW == 16 (1WARNING: NS_ENSURE_SUCCESS(res, -1) failed with result 0x80004005: file e:/builds/moz2_slave/m-in-w32-dbg/build/content/media/ogg/nsOggReader.cpp, line 705

Why yes, this would totally make us choke! I'm not at all sure what's happening there to make it intermingle that output, that seems really bad. That should all be main-thread, I'd think. In any event, we should handle this more gracefully.
https://hg.mozilla.org/mozilla-central/rev/78e104d007e0
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla15
(Reopening since just diagnostic patch landed so far aiui)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
https://tbpl.mozilla.org/php/getParsedLog.php?id=11452614&tree=Mozilla-Inbound
Rev3 WINNT 5.1 mozilla-inbound debug test mochitests-3/5 on 2012-05-03 20:38:33 PDT for push a4c02dde700c

++DOMWINDOW == 105 (1156F940) [serial = WARNING: 1 sort operation has occurred for the SQL statement '0x10f5ebd0'.  See https://developer.mozilla.org/En/Storage/Warnings details.: file e:/builds/moz2_slave/m-in-w32-dbg/build/storage/src/mozStoragePrivateHelpers.cpp, line 144
INFO | runtests.py | Received unexpected exception while running application
Traceback (most recent call last):
  File "mochitest/runtests.py", line 698, in runTests
    timeout = timeout)
  File "c:\talos-slave\test\build\mochitest\automation.py", line 900, in runApp
    status = self.waitForFinish(proc, utilityPath, timeout, maxTime, startTime, debuggerInfo, symbolsPath, logger)
  File "c:\talos-slave\test\build\mochitest\automation.py", line 749, in waitForFinish
    logger.log(line)
  File "c:\talos-slave\test\build\mochitest\automationutils.py", line 474, in log
    self._logWindow(line)
  File "c:\talos-slave\test\build\mochitest\automationutils.py", line 508, in _logWindow
    id = self._parseValue(line, "serial")
  File "c:\talos-slave\test\build\mochitest\automationutils.py", line 533, in _parseValue
    return re.search("\[%s = (.+?)\]" % name, line).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
Assignee: nobody → ttaubert
Attachment #621018 - Flags: review?(ted.mielczarek)
Comment on attachment 621018 [details] [diff] [review]
make regex parsing more tolerant towards corrupted log lines

Review of attachment 621018 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/automationutils.py
@@ +539,5 @@
>  
>    def _parseValue(self, line, name):
> +    match = re.search("\[%s = (.+?)\]" % name, line)
> +    if match:
> +      return match.group(1)

I think you should explicitly return None here if the regex match fails just to be very clear about what's happening.
Attachment #621018 - Flags: review?(ted.mielczarek) → review+
https://hg.mozilla.org/integration/fx-team/rev/b6e6a24c9e95
Whiteboard: [cleanup][orange] → [cleanup][orange][fixed-in-fx-team]
https://hg.mozilla.org/mozilla-central/rev/b6e6a24c9e95
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Whiteboard: [cleanup][orange][fixed-in-fx-team] → [cleanup][orange]
This is a pretty trivial patch, should be easy enough to transplant to Aurora if desired.
Comment on attachment 621018 [details] [diff] [review]
make regex parsing more tolerant towards corrupted log lines

[Approval Request Comment]
Regression caused by (bug #): bug 683953
User impact if declined: Intermittent Mochitest orange
Testing completed (on m-c, etc.): Patch has been running without issue on mozilla-central for a few days.
Risk to taking this patch (and alternatives if risky): Very low-risk, test-harness only.
String changes made by this patch: None
Attachment #621018 - Flags: approval-mozilla-aurora?
Comment on attachment 621018 [details] [diff] [review]
make regex parsing more tolerant towards corrupted log lines

[Approval Request Comment]
See comment 127 for Aurora approval request.
Attachment #621018 - Flags: approval-mozilla-beta?
Comment on attachment 621018 [details] [diff] [review]
make regex parsing more tolerant towards corrupted log lines

test harness only, approved.
Attachment #621018 - Flags: approval-mozilla-beta?
Attachment #621018 - Flags: approval-mozilla-beta+
Attachment #621018 - Flags: approval-mozilla-aurora?
Attachment #621018 - Flags: approval-mozilla-aurora+
Depends on: 758986
Whiteboard: [cleanup][orange] → [cleanup]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: