Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#118 closed defect (wontfix)

IPOPT segfaults from 3.6 onwards: cause of Termination code 11 error in AMPL

Reported by: paulelastic
Owned by: ipopt-team
Priority: high
Component: Ipopt
Version: 3.8
Severity: blocker
Keywords:
Cc:

Description

I'm using IPOPT 3.8.1 with AMPL, and it fails on some models but not others. My configuration is Ubuntu 9.04 32-bit, kernel 2.6.31-17. The error is a segmentation fault with return code 11 (which AMPL reports as Termination Code 11).
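For readers unfamiliar with the number: 11 is the signal number of SIGSEGV, and a POSIX shell reports a child killed by signal N as exit status 128 + N. A minimal sketch, assuming only a POSIX shell; the `sh -c` child is a stand-in for the crashing solver, not IPOPT itself:

```shell
# Stand-in for the crashing solver: a child shell that kills itself with SIGSEGV.
status=0
sh -c 'kill -s SEGV $$' || status=$?

# The shell encodes "killed by signal N" as exit status 128 + N.
signal=$((status - 128))
echo "exit status $status, signal $signal ($(kill -l "$signal"))"
```

The same arithmetic applies when `ipopt` is run directly from a shell instead of through AMPL.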

At first, I suspected problems with the ASL, but IPOPT 2.2.1 solved it fine. I've never had problems with IPOPT failing like this until version 3.6 or so. Valgrind on IPOPT 3.8.1 reports an invalid write as well as possible memory leaks.

I'm attaching the stub.nl generated from a model that fails. I'm also attaching the summarized valgrind output. The output for 'valgrind --leak-check=full' was too long, so I'm attaching the IPOPT binary so that you can reproduce the results.

================================================================

Some extra information

1) My configure string

./configure --with-blas="/usr/lib/atlas/sse2/libblas.so.3.0" \
                        --with-lapack="/usr/lib/atlas/sse2/liblapack.so.3.0" \
                        --enable-static=yes --enable-shared=no \
                        --with-pardiso="-fopenmp /ampl/lib/libpardiso400_GNU432_IA32.so" \
                        ADD_CXXFLAGS="-fPIC -fexceptions" \
                        ADD_FFLAGS="-fPIC -fexceptions" \
                        ADD_LIBS="/usr/lib/libgomp.so.1" \
                        CC=gcc-4.3 F77=gfortran-4.3 CXX=g++-4.3

2) Output from valgrind

==10370== Memcheck, a memory error detector
==10370== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==10370== Using Valgrind-3.5.0-Debian and LibVEX; rerun with -h for copyright info
==10370== Command: ipopt3 stub.nl
==10370==

==10370== Invalid write of size 8
==10370==    at 0x81EECEE: la_replace (in /ampl/bin/ipopt3)
==10370==  Address 0xd2f1aa00 is not stack'd, malloc'd or (recently) free'd
==10370==
==10370==
==10370== Process terminating with default action of signal 11 (SIGSEGV)
==10370==  Access not within mapped region at address 0xD2F1AA00
==10370==    at 0x81EECEE: la_replace (in /ampl/bin/ipopt3)
==10370==  If you believe this happened as a result of a stack
==10370==  overflow in your program's main thread (unlikely but
==10370==  possible), you can try to increase the size of the
==10370==  main thread stack using the --main-stacksize= flag.
==10370==  The main thread stack size used in this run was 8388608.
==10370==
==10370== HEAP SUMMARY:
==10370==     in use at exit: 3,124,740 bytes in 2,292 blocks
==10370==   total heap usage: 2,481 allocs, 189 frees, 3,130,641 bytes allocated
==10370==
==10370== LEAK SUMMARY:
==10370==    definitely lost: 0 bytes in 0 blocks
==10370==    indirectly lost: 0 bytes in 0 blocks
==10370==      possibly lost: 83,165 bytes in 1,272 blocks
==10370==    still reachable: 3,041,575 bytes in 1,020 blocks
==10370==         suppressed: 0 bytes in 0 blocks
==10370== Rerun with --leak-check=full to see details of leaked memory
==10370==
==10370== For counts of detected and suppressed errors, rerun with: -v
==10370== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 36 from 7)
Segmentation fault
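The line to look at in this output is the `at 0x...: function` frame under `Invalid write`. As a rough sketch, the function name can be pulled out with `sed`. The two log lines are copied from the output above; the variable names are only illustrative:

```shell
# Two lines copied from the valgrind output above.
excerpt=$(cat <<'EOF'
==10370== Invalid write of size 8
==10370==    at 0x81EECEE: la_replace (in /ampl/bin/ipopt3)
EOF
)

# Print the function name of the first "at 0x...:" stack frame.
faulting=$(printf '%s\n' "$excerpt" |
    sed -n 's/^==[0-9]*== *at 0x[0-9A-Fa-f]*: \([^ ]*\) .*/\1/p' | head -n 1)
echo "first faulting frame: $faulting"
```

On a full log, replacing the `printf` with `cat` on the saved log file should give the same result.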

Attachments (1)

stub.tar.bz2 (235.1 KB) - added by paulelastic 11 years ago.

Change History (7)

Changed 11 years ago by paulelastic

comment:1 Changed 11 years ago by paulelastic

I'm sorry, the IPOPT binary is too big to attach to this ticket. It is available here for up to 7 days from April 15, 2010: http://www.yousendit.com/download/bFFNblFHSytTRTVjR0E9PQ

If this link expires, please contact me.

comment:2 follow-up: Changed 11 years ago by andreasw

  • Resolution set to wontfix
  • Status changed from new to closed

I got the attached stub.nl file and could reproduce the error. However, the error occurs in the function la_replace, which is in the AMPL solver library. This is not Ipopt code, and the crash appears to happen before Ipopt does anything meaningful. I suggest you make sure you have the latest version of the ASL, and if that does not help, you could contact the AMPL solver developers (e.g., David Gay).

There is probably nothing in Ipopt to fix.
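One way to check where `la_replace` is defined is to list symbols with `nm`. This is a sketch, assuming binutils `nm` is installed; `find_symbol` is a helper name made up here, and the library path in the usage comment is an assumption about a typical ThirdParty/ASL build:

```shell
# find_symbol SYMBOL FILE...
# Prints each FILE (binary, object or static archive) whose symbol table
# contains SYMBOL as a defined text (code) symbol.
find_symbol() {
    sym=$1
    shift
    for f in "$@"; do
        if nm "$f" 2>/dev/null | grep -q " [Tt] $sym\$"; then
            echo "$f defines $sym"
        fi
    done
}

# Hypothetical usage (second path is an assumption):
#   find_symbol la_replace /ampl/bin/ipopt3 ThirdParty/ASL/amplsolver.a
```

If the ASL archive is listed, the function comes from the solver library rather than from Ipopt's own sources.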

comment:3 in reply to: ↑ 2 Changed 11 years ago by paulelastic

Replying to andreasw:

> I got the attached stub.nl file and could reproduce the error. However, the error occurs in the function la_replace, which is in the AMPL solver library. This is not Ipopt code, and the crash appears to happen before Ipopt does anything meaningful. I suggest you make sure you have the latest version of the ASL, and if that does not help, you could contact the AMPL solver developers (e.g., David Gay).
>
> There is probably nothing in Ipopt to fix.

Hi Andreas,

Thanks for looking into this and for pinpointing the problem. I do have the latest version of the ASL (I ran get.ASL in ThirdParty/?); perhaps I should go back to an older version.

At any rate, I know how to proceed from here. Thanks so much.

comment:4 Changed 11 years ago by paulelastic

[Problem solved] Just a follow-up comment for folks who may be googling for the solution to this problem.

I believe the problem may have something to do with the way the latest gcc (4.4.1) handles symbols in the latest ASL. I created a virtual machine with Ubuntu 8.10 LTS, and used gcc 4.2.4 to compile the IPOPT source with the latest ASL, and it worked like a charm. I copied the compiled binary back to my Ubuntu 9.04 machine (gcc 4.4.1) and IPOPT did not segfault anymore.

Bottom line: use an older compiler.
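If an older gcc is installed alongside the default one, pointing configure at it may be enough without a virtual machine. This is an untested sketch and not the procedure reported above. The `-4.2` suffixes are an assumption about how the compiler binaries are named, and the remaining options are the ones from the configure string in the description:

```shell
# Assumption: gcc, g++ and gfortran 4.2 are installed as gcc-4.2, g++-4.2, gfortran-4.2.
OLD=4.2

if [ -x ./configure ]; then
    # Add the --with-blas, --with-lapack and --with-pardiso options from the
    # original configure string as needed.
    ./configure --enable-static=yes --enable-shared=no \
        CC="gcc-$OLD" CXX="g++-$OLD" F77="gfortran-$OLD"
else
    echo "run this from the top of the Ipopt source tree" >&2
fi
```

The ASL would need to be rebuilt with the same compiler, since the crash is in ASL code.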

comment:5 Changed 11 years ago by andreasw

Thanks a lot for the solution report! I added a note at the COIN "current issues" page.
