Opened 8 years ago

Closed 8 years ago

#7876 closed defect (fixed)

firefox-47.0, was: Firefox crashes often with gcc-6

Reported by: Pierre Labastie
Owned by: ken@…
Priority: normal
Milestone: 7.10
Component: BOOK
Version: SVN
Severity: normal
Keywords:
Cc:

Description

From Pierre Bechetoille on blfs-dev: I encountered a new issue with the latest firefox compiled with gcc-6.

I had very frequent random crashes. But never in safe mode. I finally found that setting "javascript.options.baselinejit" to false in about:config fixes the crashes.

Googling for Firefox JIT compiler problems, I found these: https://bugzilla.mozilla.org/show_bug.cgi?id=1218925 (crash in js::jit::CodeGenerator::generateBody) and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70526#c14 (GCC 6 miscompiles Firefox JIT compiler)

I don't know whether it is a firefox or a gcc bug. Nevertheless, I recompiled firefox with the patch given in the first link (addition of -fno-schedule-insns2 to CFLAGS and CXXFLAGS in js/src/Makefile.in), and since then I have had no more crashes!
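
For anyone who wants the about:config workaround to persist without rebuilding, the preference can be written to the profile's user.js, which Firefox reads at startup. A minimal sketch follows. The helper name disable_baseline_jit is hypothetical, and the profile directory is an assumption that the caller has to supply (profiles normally live under ~/.mozilla/firefox/).

```shell
#!/bin/sh
# Hypothetical helper: append the pref from the report to user.js in the
# Firefox profile directory given as $1. The profile path is an assumption;
# it differs on every installation.
disable_baseline_jit() {
    profile_dir=$1
    pref='user_pref("javascript.options.baselinejit", false);'
    # Do nothing if the line is already present, so repeated runs are harmless.
    if [ -f "$profile_dir/user.js" ] && grep -qxF -- "$pref" "$profile_dir/user.js"; then
        return 0
    fi
    printf '%s\n' "$pref" >> "$profile_dir/user.js"
}
```

This only hides the crash by switching the baseline JIT off; the -fno-schedule-insns2 rebuild described above is what addresses the miscompiled code.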

Change History (20)

comment:1 by bdubbs@…, 8 years ago

I think we should hold off on this until version 47.0. Upstream released 47.0b9 just three days ago, so I would expect the final 47.0 soon.


comment:2 by bdubbs@…, 8 years ago

Milestone: 7.10 → hold
Summary: Firefox crashes often with gcc-6 → Firefox crashes often with gcc-6 (waiting for 47.0)

comment:3 by ken@…, 8 years ago

While waiting, I gave 46.0 a try, using

export CXX='g++ -std=c++11 -fno-lifetime-dse -fno-delete-null-pointer-checks'

Looking at my log, the build did pick up my existing CFLAGS of -O2 after those options.

There is good news and bad news. The good - on random websites it works as normal. I am using AdBlock Plus and Privacy Badger addons, and this is with gtk+-3.

The bad news is that although playing an HTML-5 video from youtube works, if I then pause the video firefox crashes. Tried on several videos. Using gdb, the bt shows a Python problem:

Thread 1 "firefox" received signal SIGSEGV, Segmentation fault.
0x00007ff197285120 in ?? () from /usr/lib/firefox-46.0/libxul.so
(gdb) bt
#0  0x00007ff197285120 in  () at /usr/lib/firefox-46.0/libxul.so
#1  0x00007ff18bb5f162 in  ()
#2  0x00007ffd0ac73df8 in  ()
#3  0x00007ffd0ac73d20 in  ()
Python Exception <type 'exceptions.OverflowError'> long too big to convert: 
#4  0xffffffffffffffff in  ()#5  0x00007ff150851740 in  ()
#6  0x0000000000000007 in  ()
#7  0x00007ff100000000 in  ()
Python Exception <type 'exceptions.OverflowError'> long too big to convert: 
#8  0xffffffffffffffff in  ()#9  0x00007ffd0ac73d50 in  ()
Python Exception <type 'exceptions.OverflowError'> long too big to convert: 
#10 0xffffffffffffffff in  ()#11 0x00007ff198e0aba0 in  () at /usr/lib/firefox-46.0/libxul.so
#12 0x00007ff18555c9a0 in  ()
#13 0x00007ff181a965d8 in  ()
#14 0x0000000000001101 in  ()
#15 0x00007ffd0ac73dc8 in  ()
#16 0x00007ff161791608 in  ()
Python Exception <type 'exceptions.OverflowError'> long too big to convert: 
(that line repeats 7 times)
#17 0xffffffffffffffff in  ()#18 0xffffffffffffffff in  ()#19 0xffffffffffffffff in  ()#20 0xffffffffffffffff in  ()#21 0xffffffffffffffff in  ()#22 0xffffffffffffffff in  ()#23 0xffffffffffffffff in  ()#24 0xffffffffffffffff in  ()#25 0x00007ff172c41058 in  ()
#26 0x0000000000002f00 in  ()
#27 0x00007ff100000068 in  ()
#28 0x00007ff1508bdd00 in  ()
Python Exception <type 'exceptions.OverflowError'> long too big to convert: 
#29 0xffffffffffffffff in  ()#30 0x000000000000007b in  ()
#31 0x00007ff1508d0f80 in  ()
#32 0x00007ff18bb5f655 in  ()
#33 0x000000000000060b in  ()
#34 0x00007ff150869140 in  ()
#35 0x0000000000000001 in  ()
Python Exception <type 'exceptions.OverflowError'> long too big to convert: 
(this time, 5 repeats)
#36 0xffffffffffffffff in  ()#37 0xffffffffffffffff in  ()#38 0xffffffffffffffff in  ()#39 0xffffffffffffffff in  ()#40 0xffffffffffffffff in  ()#41 0xffffffffffffffff in  ()#42 0x00007ff172c3d60e in  ()
#43 0x0000000000001c00 in  ()
#44 0x00007ff150869140 in  ()
#45 0x0000000000000001 in  ()
Python Exception <type 'exceptions.OverflowError'> long too big to convert: 
Python Exception <type 'exceptions.OverflowError'> long too big to convert: 
#46 0xffffffffffffffff in  ()#47 0xffffffffffffffff in  ()#48 0x00007ff16300dc60 in  ()
#49 0x00007ff150869140 in  ()

Looks as if I'll need to rebuild with debugging information and unstripped. Might as well try 46.0.1 just in case.

In theory 47.0 is due this week, and fedora are apparently using it (with a lot of upstream patches).

comment:4 by ken@…, 8 years ago

On the next two rebuilds after that, I at first forgot to move from 46.0 to 46.0.1. The first time, I added -g to CXXFLAGS and commented out --enable-strip, but forgot to comment out --enable-install-strip. That worked well, but gave no symbols. I then commented that out too and moved to 46.0.1: it still worked well, but still had no symbols (e.g. in the moz libs). That matches comment #19 in https://bugzilla.mozilla.org/show_bug.cgi?id=1256687

Then I tried to work out which of those three changes was doing the business. But with any lesser variation, and now even when going back to using all three, it crashes within seconds of starting a video.

I hope 47 fixes it.

comment:5 by ken@…, 8 years ago

47 is out, but it requires autoconf-2.13.

Looking at fedora, they ship 2.13 as well as a modern version, and make firefox depend on it. And they have been doing so since at least 46.0.1. This looks like it is going to hurt.

comment:6 by bdubbs@…, 8 years ago

They haven't done that before. autoconf-2.13 was released in 1999. Are you sure?

comment:7 by ken@…, 8 years ago

I finally managed to download 46.0.1 - with the identical mozconfig that I was using for 47, and the book's instructions, 46 fails fairly quickly. But 47 [ -j8 ] fails after 3.064 seconds.

The strange thing is that client.mk appears to be the file reporting this error, but diffing against 46.0.1 shows that code is not new.

From 47.0.1:

creating ./config.status
js/src> configuring
js/src> running /scratch/ken/firefox-47.0/firefox-build-dir/_virtualenv/bin/python /scratch/ken/firefox-47.0/build/../configure.py --disable-necko-wifi --disable-gstreamer --disable-pulseaudio --disable-gconf --enable-system-sqlite --with-system-libevent --with-system-libvpx --with-system-nss --with-system-icu --prefix=/usr --enable-application=browser --disable-crashreporter --disable-updater --disable-tests --enable-optimize --enable-strip --enable-install-strip --enable-gio --enable-official-branding --enable-safe-browsing --enable-url-classifier --enable-system-ffi --enable-system-pixman --with-pthreads --with-system-bz2 --with-system-jpeg --with-system-png --with-system-zlib --enable-threadsafe --enable-ctypes --disable-shared-js --disable-export-js --with-nspr-cflags=-I/usr/include/nspr --with-nspr-libs=-L/usr/lib -lplds4 -lplc4 -lnspr4 -lpthread -ldl --prefix=/scratch/ken/firefox-47.0/firefox-build-dir/dist --enable-jemalloc --cache-file=/scratch/ken/firefox-47.0/firefox-build-dir/config.cache
js/src> Could not find autoconf 2.13

*** Fix above errors and then restart with\
               "make -f client.mk build"
make[2]: *** [/scratch/ken/firefox-47.0/client.mk:384: configure] Error 1
make[2]: Leaving directory '/scratch/ken/firefox-47.0'
make[1]: *** [/scratch/ken/firefox-47.0/client.mk:396: /scratch/ken/firefox-47.0/firefox-build-dir/Makefile] Error 2
make[1]: Leaving directory '/scratch/ken/firefox-47.0'
make: *** [client.mk:181: build] Error 2

And the relevant lines are:

$ grep 2.13 client.mk
# try to find autoconf 2.13 - discard errors from 'which'
AUTOCONF ?= $(shell which autoconf-2.13 autoconf2.13 autoconf213 2>/dev/null | grep -v '^no autoconf' | head -1)
AUTOCONF = $(shell which fink >/dev/null 2>&1 && echo `which fink`/../../lib/autoconf2.13/bin/autoconf)
AUTOCONF=$(error Could not find autoconf 2.13)
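
The lookup that client.mk performs can be reproduced from a shell to see what the build will find before starting it. This is only a sketch of the same search order (autoconf-2.13, autoconf2.13, autoconf213); the function name find_autoconf213 is hypothetical, and it uses command -v instead of which.

```shell
#!/bin/sh
# Mirror the search in client.mk: print the first of the three candidate
# names that exists as a command on PATH. The function name is hypothetical.
find_autoconf213() {
    for name in autoconf-2.13 autoconf2.13 autoconf213; do
        if path=$(command -v "$name" 2>/dev/null); then
            printf '%s\n' "$path"
            return 0
        fi
    done
    # Same message that client.mk raises through $(error ...).
    echo "Could not find autoconf 2.13" >&2
    return 1
}
```

If this prints nothing, the build will stop with the error shown above unless AUTOCONF is set in the environment.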

I had hoped to test this before I went to bed, but I'm giving up and hopefully I'll be doing other things for the next couple of days.

Oh, and when I saw that fedora were using 47 - looks as if they actually use betas (no surprise there, but the commit message made me think upstream had cut the release).

Looks as if I should have commented disable-gstreamer ... that doesn't seem to be what causes that failure.

comment:8 by bdubbs@…, 8 years ago

I downloaded autoconf-2.13 and built it. I just did ./configure && make && sudo make install. It takes all of about 2 seconds. Without a --prefix it installs in /usr/local. These are the files it installs:

-rw-r--r-- 151138 Jun  8 20:08 /usr/local/share/autoconf/autoconf.m4f
-rw-r--r--   7841 Jun  8 20:08 /usr/local/share/autoconf/acconfig.h
-rw-r--r--    526 Jun  8 20:08 /usr/local/share/autoconf/acidentifiers
-rw-r--r--   4065 Jun  8 20:08 /usr/local/share/autoconf/acoldnames.m4
-rw-r--r--   2585 Jun  8 20:08 /usr/local/share/autoconf/autoheader.m4
-rw-r--r--    210 Jun  8 20:08 /usr/local/share/autoconf/acmakevars
-rw-r--r--   1186 Jun  8 20:08 /usr/local/share/autoconf/autoconf.m4
-rw-r--r--    701 Jun  8 20:08 /usr/local/share/autoconf/acheaders
-rw-r--r--   1411 Jun  8 20:08 /usr/local/share/autoconf/acfunctions
-rw-r--r--    334 Jun  8 20:08 /usr/local/share/autoconf/acprograms
-rw-r--r--  83018 Jun  8 20:08 /usr/local/share/autoconf/acspecific.m4
-rw-r--r--  81308 Jun  8 20:08 /usr/local/share/autoconf/acgeneral.m4
-rw-r--r-- 148289 Jun  8 20:08 /usr/local/share/autoconf/autoheader.m4f

-rwxr-xr-x   4907 Jun  8 20:08 /usr/local/bin/autoconf-2.13
-rwxr-xr-x   3173 Jun  8 20:08 /usr/local/bin/autoupdate
-rwxr-xr-x   8640 Jun  8 20:08 /usr/local/bin/autoheader
-rwxr-xr-x   9539 Jun  8 20:08 /usr/local/bin/autoscan
-rwxr-xr-x   2788 Jun  8 20:08 /usr/local/bin/ifnames
-rwxr-xr-x   6050 Jun  8 20:08 /usr/local/bin/autoreconf

-rw-r--r-- 130150 Jun  8 20:08 /usr/local/info/standards.info
-rw-r--r-- 251495 Jun  8 20:08 /usr/local/info/autoconf.info

I did rename autoconf to autoconf-2.13.

The build of FF (at -j4) with the current instructions went without issues:

1116.5 Elapsed Time -  firefox-47.0.source
SBU=12.005
183484 /usr/src/firefox/firefox-47.0.source.tar.xz size (179.183 MB)
4812712 kilobytes build size (4699.914 MB)
md5sum : 0bd5991a6c821dd1a34ead0f8bbb301a  /usr/src/firefox/firefox-47.0.source.tar.xz

I tried to run it over ssh and it came up OK, but the first thing I did was Help->About. It crashed hard:

$ firefox

(firefox:19465): Gdk-WARNING **: gdkproperty-x11.c:325 invalid X atom: 327

(firefox:19465): Gdk-WARNING **: gdkproperty-x11.c:325 invalid X atom: 352
The program 'firefox' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadPixmap (invalid Pixmap parameter)'.
  (Details: serial 11764 error_code 4 request_code 54 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
Assertion failure: 0 == rv, at ptthread.c:292
Redirecting call to abort() to mozalloc_abort

Segmentation fault

I then went over to the development console and ran FF and it did not crash, but I haven't done much more testing than that. www.linuxfromscratch.org does work fine.

Went back to workstation and tried again *over ssh* and it was fine. youtube plays OK, but moving the mouse around caused a segfault.

'firefox --no-remote' seems to work over ssh without segfaults.

comment:9 by ken@…, 8 years ago

Thanks for looking. I'm still baffled about why it used to work with current autoconf.

The link at the bottom of https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Linux_Prerequisites links to a bug which shows 2.13 has been needed since 2001.

This is almost at the start of client.mk, yet for me 46 does not need old autoconf.

Also, I've had a look at gentoo's tarball of patches for 47.0 : nothing relevant to autoconf, and in their ebuild they have WANT_AUTOCONF="2.1"

comment:10 by ken@…, 8 years ago

Two further patches (less important than fixing the build) -

  1. With gtk+-3.20 the sliders in the scrollbars disappear. I noticed that when I was building with gcc-5.3, but put up with it. This week I realised the problem does not exist on my BLFS-7.9 system with gtk-3.18. There is a patch at e.g. http://pkgs.fedoraproject.org/cgit/rpms/firefox.git/plain/firefox-gtk3-20.patch (not sure where it came from).
  2. There is a patch (67K) for using system harfbuzz and graphite2. I noticed this in gentoo's patches; the original is at freebsd, https://svnweb.freebsd.org/ports/head/www/firefox-esr/files/patch-bug847568?revision=413726 . The mozilla status seems to be "under review", but it has been like that for months and the bug was initially opened years ago: https://bugzilla.mozilla.org/show_bug.cgi?id=847568

comment:11 by bdubbs@…, 8 years ago

I suspect the reason for not needing autoconf before is that the time stamps on the files made it unneeded. Something changed in 47 to make it needed by default.

in reply to:  11 comment:12 by ken@…, 8 years ago

Owner: changed from blfs-book@… to ken@…
Status: new → assigned
Summary: Firefox crashes often with gcc-6 (waiting for 47.0) → firefox-47.0, was: Firefox crashes often with gcc-6 (waiting for 47.0)

Replying to bdubbs@…:

I suspect the reason for not needing autoconf before is that the time stamps on the files made it unneeded. Something changed in 47 to make it needed by default.

Yes, they added configure.py :-( Looks as if that is intended to eventually replace the configure script. It has an option to read AUTOCONF from the environment, but when I try that I get

autoconf: error: invalid option `--localdir=/scratch/ken/firefox-47.0/js/src'

So it does indeed now need 2.13. But I don't think that installing it in /usr/local is a good idea - ideally, install the binary as autoconf-2.13 in /usr/bin : I'm going to read Tushar's hint http://www.linuxfromscratch.org/hints/downloads/files/autotools-multiversion.txt

For the moment I'm on gcc-5.3, long way still to go on this. But I suppose I ought to take the ticket, and rename it to 47.
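
Since pointing AUTOCONF at a modern autoconf ends in the --localdir error above, it may be worth checking the version of whatever AUTOCONF names before starting a build. The sketch below assumes that the first line of the --version output ends with the version number, which should hold for both 2.13 and current releases; the function name autoconf_is_213 is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the autoconf command given as $1
# (default: $AUTOCONF, then plain "autoconf") reports version 2.13.
# Assumption: the first line of its --version output ends in the version.
autoconf_is_213() {
    cmd=${1:-${AUTOCONF:-autoconf}}
    # The character before "2.13" must not be a digit or a dot, so that a
    # version such as 12.13 is not accepted by accident.
    "$cmd" --version 2>/dev/null | head -n 1 | grep -q '[^0-9.]2\.13$'
}
```

A possible use is: autoconf_is_213 "$AUTOCONF" || echo "AUTOCONF is not autoconf 2.13" >&2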

comment:13 by bdubbs@…, 8 years ago

autoconf-2.13 is a shell script 159 lines long.

As far as I can see it only uses autoconf.m4, which includes acgeneral.m4 and some other m4 files.

I do think Tushar's hint is the way to go, but I wonder if it should be a part of FF or a separate package. Since nothing else we have needs it, I lean toward including it in FF.

comment:14 by Armin K, 8 years ago

Nothing else needs it now. Since it's used by firefox, it's only a matter of time before it gets used by other mozilla products which share the same platform (seamonkey and thunderbird).

in reply to:  8 comment:15 by ken@…, 8 years ago

Replying to bdubbs@…:

The build of FF (at -j4) with the current instructions went without issues:

1116.5 Elapsed Time -  firefox-47.0.source
SBU=12.005
183484 /usr/src/firefox/firefox-47.0.source.tar.xz size (179.183 MB)
4812712 kilobytes build size (4699.914 MB)
md5sum : 0bd5991a6c821dd1a34ead0f8bbb301a  /usr/src/firefox/firefox-47.0.source.tar.xz

That puzzles me - I've had to use a couple of patches from fedora (arch also use them) to get past very early problems with scope in mozalloc. I also threw in the -f switches they are using.

Had it running, seemed stable (youtube was fine) - but lacked the gtk3 slider (I'm using gtk3, it is upstream's default), and also I wanted to try system graphite/harfbuzz.

I tried to run it over ssh and it came up OK, but the first thing I did was Help->About. It crashed hard:

$ firefox

(firefox:19465): Gdk-WARNING **: gdkproperty-x11.c:325 invalid X atom: 327

(firefox:19465): Gdk-WARNING **: gdkproperty-x11.c:325 invalid X atom: 352
The program 'firefox' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadPixmap (invalid Pixmap parameter)'.
  (Details: serial 11764 error_code 4 request_code 54 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)
Assertion failure: 0 == rv, at ptthread.c:292
Redirecting call to abort() to mozalloc_abort

Segmentation fault

I then went over to the development console and ran FF and it did not crash, but I haven't done much more testing than that. www.linuxfromscratch.org does work fine.

Went back to workstation and tried again *over ssh* and it was fine. youtube plays OK, but moving the mouse around caused a segfault.

I've now got that happening on my latest build (segfault if I move the mouse in youtube). I'm attempting to build a debug version, trying --disable-strip, which fedora are using.

I have also removed the book's sed - not sure if I used that in my earlier manual build, but fedora and Arch seem to lack anything like that (I believe it was an earlier attempt at a fix).

comment:16 by ken@…, 8 years ago

Hmm, I thought that the newest build was good, but it just took longer to trigger: only after I opened a second video in a new tab and tried to switch to that to pause it. Again, it comes down to problems in js/jit, specifically in js::jit::HandleException, where something eventually throws up the old

Python Exception <type 'exceptions.OverflowError'> long too big to convert:

Unfortunately my Python is stripped, I suppose I need to rebuild that.

Reminder to self: it looks as if strip and install-strip are enabled by default, so the --disable forms need to be added to get debugging symbols. Also, fedora's --enable-release probably overrides those.
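
As a sketch of that reminder, the mozconfig can be edited so that any explicit strip requests are commented out and the two --disable forms are present. The helper name keep_symbols_in_mozconfig is hypothetical, and the file is assumed to use one ac_add_options line per switch.

```shell
#!/bin/sh
# Hypothetical helper: edit the mozconfig given as $1 so that neither the
# build nor "make install" strips the binaries.
keep_symbols_in_mozconfig() {
    mozconfig=$1
    tmp=$mozconfig.tmp
    # Comment out any explicit requests to strip.
    sed -e 's/^ac_add_options --enable-strip$/# &/' \
        -e 's/^ac_add_options --enable-install-strip$/# &/' \
        "$mozconfig" > "$tmp" && mv "$tmp" "$mozconfig"
    # Add the --disable forms unless they are already there.
    for opt in --disable-strip --disable-install-strip; do
        line="ac_add_options $opt"
        grep -qxF -- "$line" "$mozconfig" || printf '%s\n' "$line" >> "$mozconfig"
    done
}
```

This only keeps symbols that the compiler actually produced, so -g still has to be in the compiler flags, as in comment 4.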

I will also note that the mozilla bug 1218925 linked at the start of this ticket was for ff43 on windows.

comment:17 by ken@…, 8 years ago

Still getting nowhere. I was trying to use system graphite2 and system harfbuzz with the patch. A particular youtube video https://www.youtube.com/watch?v=Y7ILuenEZMs caused a segfault within a second or two.

Without that patch and the associated configure switches, the video started to play but segfaulted when I clicked on full-screen after a few seconds. Retrying, it plays full-screen without problems.

The variability makes me suspect that gcc-6.1 is miscompiling python2.

comment:18 by bdubbs@…, 8 years ago

Milestone: hold → 7.10
Summary: firefox-47.0, was: Firefox crashes often with gcc-6 (waiting for 47.0) → firefox-47.0, was: Firefox crashes often with gcc-6

comment:19 by ken@…, 8 years ago

Summary: the sed currently in the book DOES cover the c++ scope issue. Needs at least a patch for gtk+-3.20. Taking this to -dev to discuss the book's defaults for this version.

comment:20 by ken@…, 8 years ago

Resolution: fixed
Status: assigned → closed

Further comment before I close this - the patch for gtk+-3.20, which is apparently upstream and will probably appear in 49.0, makes copy-and-paste much harder in gtk+-3.18.

Fixed at r17495.
