Java Core Dump

https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/toc.html
https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/bugreports004.html#CHDHDCJD - done reading
https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/felog003.html#BABFFJBB
https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/felog004.html#BABFIIJH
https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/felog005.html#BABIBEJD
https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/felog006.html#BABFJBCC
http://javaeesupportpatterns.blogspot.com/p/thread-dump-analysis.html

https://www.youtube.com/watch?v=78hvWy_c88Y - thread state in java thread dump analysis - good
https://www.youtube.com/watch?v=1qzHSEjU8Hc - Thread Dump Analysis Fundamentals - Part 1 - Pierre-Hugues Charbonneau - done watching
https://www.youtube.com/watch?v=H34ZEkrJV1k - Thread Dump Analysis Fundamentals - Part 2 - Pierre-Hugues Charbonneau
https://www.youtube.com/watch?v=1qzHSEjU8Hc&list=PLeLNWvESQ0GaJv8VCelD0bXiTIcVCRuSC - Pierre-Hugues Charbonneau

https://www.youtube.com/watch?v=UnaNQgzw4zY
https://www.youtube.com/watch?v=YQgmF8I-zhk
https://www.youtube.com/watch?v=yxtsPoe-beY
https://www.youtube.com/watch?v=kQpkjCUQvEc
https://www.youtube.com/watch?v=ZBJ0u9MaKtM
https://www.youtube.com/watch?v=VuahrRY0TgU
https://www.youtube.com/watch?v=Bnq3gwQdUqg
https://www.youtube.com/watch?v=Ta9fyS_VMA8
https://www.youtube.com/watch?v=PSZ-9NOaMq8
https://www.youtube.com/watch?v=PUUJ4rNpRhU
https://www.youtube.com/watch?v=FLcXf9pO27w
https://www.youtube.com/watch?v=5joejuE2rEM

https://www.youtube.com/watch?v=0Xt-au2QnRg
https://www.youtube.com/watch?v=igc5_JXHZDY
https://www.youtube.com/watch?v=jgIX5Q03yfs
https://www.youtube.com/watch?v=gz4LFnOstes
https://www.youtube.com/watch?v=vwGJesp4ofU
https://www.youtube.com/watch?v=JRhXFVmkqNU&list=PLDui10DbquTIJElXW5D6pV1KQVJaIp1Jd
https://www.youtube.com/watch?v=YZrN8jpyp1k
https://www.youtube.com/watch?v=ok-duNfq8Kk
https://www.youtube.com/watch?v=_shKdU7mGxs
https://www.youtube.com/watch?v=XIosIzFTREo
https://www.youtube.com/watch?v=h8mtQ1FyoA4
https://www.youtube.com/watch?v=jd6dJa7tSNU
https://www.youtube.com/watch?v=23ZKGT4nk9I
https://www.youtube.com/watch?v=mlfz6c9frSU
https://www.youtube.com/watch?v=3lwNLD-vHtU

https://www.youtube.com/watch?v=3MsiN27xE4c
https://www.youtube.com/watch?v=4M06v-2h7E8
https://www.youtube.com/watch?v=Vp_wdZ-7omg
https://www.youtube.com/watch?v=5cbtNDGxxDs

https://www.youtube.com/watch?v=MmFuWmzeiDs

How can we enable core dump?

To enable core dumping, try "ulimit -c unlimited" before starting Java again.

The ulimit utility is used to get or set the limitations on the system resources available to the current shell and its descendants. Use the ulimit -c command to check or set the core file size limit. Make sure that the limit is set to unlimited; otherwise the core file could be truncated.

ulimit is a Bash shell built-in command; on a C shell, use the limit command.
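
The limit can also be checked from inside a running JVM. The following is a minimal sketch, assuming a Linux system where /proc/self/limits is readable; the class name CoreLimitCheck and its helper method are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

class CoreLimitCheck {
    private static final String LABEL = "Max core file size";

    /** Returns the soft limit from the "Max core file size" row of /proc/<pid>/limits, if present. */
    static Optional<String> softCoreLimit(List<String> limitsLines) {
        for (String line : limitsLines) {
            if (line.startsWith(LABEL)) {
                // Remaining columns: soft limit, hard limit, units (separated by runs of spaces).
                String[] cols = line.substring(LABEL.length()).trim().split("\\s+");
                return Optional.of(cols[0]);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        Path limits = Paths.get("/proc/self/limits"); // assumption: Linux with /proc mounted
        if (!Files.isReadable(limits)) {
            System.out.println("Cannot read " + limits + "; is this a Linux system?");
            return;
        }
        Optional<String> soft = softCoreLimit(Files.readAllLines(limits));
        System.out.println("Soft core file size limit: " + soft.orElse("unknown"));
        if (soft.isPresent() && !soft.get().equals("unlimited")) {
            System.out.println("Core files may be truncated or missing; "
                    + "run 'ulimit -c unlimited' before starting Java.");
        }
    }
}
```

Because the limit is inherited from the shell that started the JVM, the value printed here reflects whatever ulimit setting was in effect at launch time.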

Where can we find the core dump file?

On the Linux operating system, unhandled signals such as segmentation violation, illegal instruction, and so forth, result in a core dump. By default, the core dump is created in the current working directory of the process and the name of the core dump file is core.pid, where pid is the process id of the crashed Java process.
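
As a rough illustration of the default location, the sketch below prints the path where a core file for the current JVM would be expected. It assumes Java 9 or later (for ProcessHandle); the class name CoreFileLocation is hypothetical. It also prints the Linux kernel.core_pattern setting when readable, on the assumption that this setting can override the default name and location.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class CoreFileLocation {
    /** Default location described above: <working directory>/core.<pid>. */
    static Path defaultCorePath(Path workingDir, long pid) {
        return workingDir.resolve("core." + pid);
    }

    public static void main(String[] args) throws IOException {
        Path cwd = Paths.get("").toAbsolutePath();   // working directory of this JVM
        long pid = ProcessHandle.current().pid();    // Java 9+
        System.out.println("Default core file would be: " + defaultCorePath(cwd, pid));

        // Assumption: on Linux this kernel setting may change the core file name or location.
        Path pattern = Paths.get("/proc/sys/kernel/core_pattern");
        if (Files.isReadable(pattern)) {
            String value = String.join("", Files.readAllLines(pattern)).trim();
            System.out.println("kernel.core_pattern = " + value);
        }
    }
}
```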

How can we take a core dump on Linux?

To get the list of Java processes running on the machine, you can use any of the following commands:

ps -ef | grep java
pgrep java
jps

The jps command-line utility does not perform name matching (that is, looking for "java" in the process command name) and so it can list Java VM embedded processes as well as the Java processes.
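
The name-matching approach of ps and pgrep can be approximated in Java itself. This is only a sketch, assuming Java 9 or later; the class and method names are hypothetical. Like grep, it matches on the executable name and so would miss a Java VM embedded in a custom launcher, which is exactly the case where jps is more reliable.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class JavaProcessLister {
    /** True if the executable's file name contains "java" (name matching, like grep). */
    static boolean looksLikeJava(Optional<String> command) {
        return command
                .map(c -> c.substring(Math.max(c.lastIndexOf('/'), c.lastIndexOf('\\')) + 1))
                .map(name -> name.toLowerCase().contains("java"))
                .orElse(false);
    }

    /** Lists the processes visible to this JVM whose executable name looks like Java. */
    static List<ProcessHandle> javaProcesses() {
        return ProcessHandle.allProcesses()
                .filter(p -> looksLikeJava(p.info().command()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (ProcessHandle p : javaProcesses()) {
            System.out.println(p.pid() + " " + p.info().command().orElse("?"));
        }
    }
}
```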

The following is one option to collect core dumps on Linux.

ShowMessageBoxOnError option in Linux: A Java process can be started with the -XX:+ShowMessageBoxOnError command-line option. When a fatal error is encountered, the process prints a message to standard error and waits for a yes or no response from standard input. Example 17-3 shows the output when an unexpected signal occurs.

Example 17-3 Unexpected Signal Error in Linux

=======================================================================
Unexpected Error
-----------------------------------------------------------------------
SIGSEGV (0xb) at pc=0x06232e5f, pid=11185, tid=8194
Do you want to debug the problem?
To debug, run 'gdb /proc/11185/exe 11185'; then switch to thread 8194
Enter 'yes' to launch gdb automatically (PATH must include gdb)
Otherwise, press RETURN to abort...

Type yes to launch the gdb (GNU Debugger) interface, as suggested by the error report shown above. At the gdb prompt, you can give the gcore command. This command creates a core dump of the debugged process with the name core.pid, where pid is the process ID of the crashed process. Make sure that the gcore command is supported in your version of gdb; run help gcore at the gdb prompt to check.

To generate a thread dump (as opposed to a core dump), we need to know the PID of the Java program. The commands above help us determine it. There are several ways to generate a thread dump:

  1. kill -3 PID
  2. jstack PID
  3. Java VisualVM
  4. WebLogic Admin Console - Monitoring tab

jstack is part of the JDK. Some production environments have only the JRE installed, so jstack is not available there. In such cases, we can use the kill -3 PID command to generate the dump. With kill -3 PID, the dump is sent to the console / terminal that is running the Java program, which is typically not the console that we are currently using. For Tomcat, the dump is written to the catalina.out file. With jstack PID, the dump is written to the console from which we run jstack, which is more convenient. However, if the machine is struggling, kill -3 PID may be a faster way to generate the dump. Java VisualVM is nice but, like jstack, it is more resource-intensive than kill -3 PID.

With the WebLogic Admin Console - Monitoring tab, WebLogic tries to be helpful by trimming some information from the dump, but sometimes that information happens to be what we need. In that case, we have to fall back to the jstack PID or kill -3 PID command.
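
When none of these options is convenient, a thread dump can also be produced from inside the application. The sketch below is not the mechanism used by kill -3 or jstack; it swaps in the standard java.lang.management API, and the output layout is a simplified one invented here, not the HotSpot thread dump format. The class name InProcessThreadDump is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class InProcessThreadDump {
    /** Builds a simple text dump: one header line per live thread, followed by its stack frames. */
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Ask for lock details only when this JVM supports collecting them.
        ThreadInfo[] infos = threads.dumpAllThreads(
                threads.isObjectMonitorUsageSupported(),
                threads.isSynchronizerUsageSupported());
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" id=").append(info.getThreadId())
               .append(' ').append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Because the dump is returned as a string, the application can write it wherever it likes (a log file, for instance) rather than depending on where standard output happens to be redirected.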

We should have a way to use New Relic or some other tool to automatically detect deadlocks, database connection leaks, and data leaks, and to send the corresponding dumps to us automatically.
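
For the deadlock part, the JVM already offers a building block. The following is a minimal sketch using the standard ThreadMXBean API; it covers deadlock detection only, and how the result would be forwarded to New Relic or another tool is left open. The class name DeadlockWatcher is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class DeadlockWatcher {
    /** Returns details of deadlocked threads, or an empty array when none are found. */
    static ThreadInfo[] findDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()          // monitors and java.util.concurrent locks
                : threads.findMonitorDeadlockedThreads();  // object monitors only
        if (ids == null) {
            return new ThreadInfo[0];                      // null means no deadlock was detected
        }
        return threads.getThreadInfo(ids, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        ThreadInfo[] deadlocked = findDeadlocks();
        if (deadlocked.length == 0) {
            System.out.println("No deadlock detected.");
        }
        for (ThreadInfo info : deadlocked) {
            if (info != null) {                            // a thread may have exited in the meantime
                System.err.print(info);
            }
        }
    }
}
```

A real monitor would presumably call findDeadlocks() on a schedule from a background thread and raise an alert when the result is non-empty.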

How can we specify the location of Fatal Error Log?

To specify where the log file will be created, use the product flag -XX:ErrorFile=file, where file represents the full path for the log file location. The substring %% in the file variable is converted to %, and the substring %p is converted to the PID of the process. In the following example, the error log file will be written to the directory /var/log/java and will be named java_errorpid.log:

java -XX:ErrorFile=/var/log/java/java_error%p.log

If the -XX:ErrorFile=file flag is not specified, the default log file name is hs_err_pid.log, where pid is the PID of the process. In addition, if the -XX:ErrorFile=file flag is not specified, the system attempts to create the file in the working directory of the process. In the event that the file cannot be created in the working directory (insufficient space, permission problem, or other issue), the file is created in the temporary directory for the operating system. On Oracle Solaris and Linux operating systems the temporary directory is /tmp. On Windows the temporary directory is specified by the value of the TMP environment variable; if that environment variable is not defined, the value of the TEMP environment variable is used.
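
The substitution rules for the -XX:ErrorFile value can be illustrated with a small sketch. This is not the JVM's own implementation; it only mimics the two documented substitutions (%p and %%), and it leaves any other use of % unchanged, which is an assumption. The class name ErrorFilePattern is hypothetical.

```java
class ErrorFilePattern {
    /** Expands %p to the PID and %% to a literal %, as described for -XX:ErrorFile. */
    static String expand(String pattern, long pid) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '%' && i + 1 < pattern.length()) {
                char next = pattern.charAt(i + 1);
                if (next == 'p') {          // %p -> process ID
                    out.append(pid);
                    i++;
                    continue;
                }
                if (next == '%') {          // %% -> single percent sign
                    out.append('%');
                    i++;
                    continue;
                }
            }
            out.append(c);                  // ordinary character (or an unrecognised %, kept as-is)
        }
        return out.toString();
    }

    public static void main(String[] args) {
        long pid = ProcessHandle.current().pid(); // Java 9+
        System.out.println(expand("/var/log/java/java_error%p.log", pid));
    }
}
```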

What is a core dump?

A core dump or a crash dump is a memory snapshot of a running process. A core dump can be automatically created by the operating system when a fatal or unhandled error (for example, signal or system exception) occurs. Alternatively, a core dump can be forced by means of system-provided command-line utilities. Sometimes a core dump is useful when diagnosing a process that appears to be hung; the core dump may reveal information about the cause of the hang.

When collecting a core dump, be sure to gather other information about the environment so that the core file can be analyzed (for example, OS version, patch information, and the fatal error log).

Core dumps do not usually contain all the memory pages of the crashed or hung process. With each of the operating systems discussed here, the text (or code) pages of the process are not included in core dumps. But to be useful, a core dump must consist of pages of heap and stack as a minimum. Collecting non-truncated good core dump files is essential for postmortem analysis of the crash.

What are the reasons for not getting a core file?

The following list explains the major reasons that a core file might not be generated. This list pertains to both Oracle Solaris and Linux operating systems, unless specified otherwise.

  1. The current user does not have permission to write in the current working directory of the process.
  2. The current user has write permission on the current working directory, but there is already a file named core that has read-only permission.
  3. The current directory does not have enough space or there is no space left.
  4. The current directory has a subdirectory named core.
  5. The current working directory is remote. It might be mapped by NFS (Network File System), and NFS failed just at the time the core dump was about to be created.
  6. Oracle Solaris operating system only: The coreadm tool has been used to configure the directory and name of the core file, but any of the above reasons apply for the configured directory or filename.
  7. The core file size limit is too low. Check your core file limit using the ulimit -c command (Bash shell) or the limit coredumpsize command (C shell). If the output from this command is not unlimited, the core dump file size might not be large enough. If this is the case, you will get truncated core dumps or no core dump at all. In addition, ensure that any scripts that are used to launch the VM or your application do not disable core dump creation.
  8. The process is running a setuid program and therefore the operating system will not dump core unless it is configured explicitly.
  9. Java specific: If the process received SIGSEGV or SIGILL but no core dump, it is possible that the process handled it. For example, HotSpot VM uses the SIGSEGV signal for legitimate purposes, such as throwing NullPointerException, deoptimization, and so forth. The signal is unhandled by the Java VM only if the current instruction (PC) falls outside Java VM generated code. These are the only cases in which HotSpot dumps core.
  10. Java specific: The JNI Invocation API was used to create the VM. The standard Java launcher was not used. The custom Java launcher program handled the signal by just consuming it and produced the log entry silently. This situation has occurred with certain Application Servers and Web Servers. These Java VM embedding programs transparently attempt to restart (fail over) the system after an abnormal termination. In this case, the fact that a core dump is not produced is a feature and not a bug.
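
Some of the directory-related reasons in the list (items 1 through 4) can be checked ahead of time. The sketch below is a rough pre-flight check under stated assumptions: it inspects only the given directory, the free-space threshold is an arbitrary value supplied by the caller, and the class name CoreDumpPreflight is hypothetical. It does not cover the ulimit, setuid, NFS, or signal-handling cases.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class CoreDumpPreflight {
    /** Returns human-readable problems that could prevent a core file from being written. */
    static List<String> problems(Path workingDir, long minFreeBytes) {
        List<String> found = new ArrayList<>();
        if (!Files.isWritable(workingDir)) {
            found.add("no write permission on " + workingDir);
        }
        Path core = workingDir.resolve("core");
        if (Files.isDirectory(core)) {
            found.add("a subdirectory named core exists");
        } else if (Files.exists(core) && !Files.isWritable(core)) {
            found.add("an existing file named core is read-only");
        }
        try {
            if (Files.getFileStore(workingDir).getUsableSpace() < minFreeBytes) {
                found.add("less than " + minFreeBytes + " bytes free");
            }
        } catch (IOException e) {
            found.add("could not query free space: " + e.getMessage());
        }
        return found;
    }

    public static void main(String[] args) {
        Path cwd = Paths.get("").toAbsolutePath();
        // 1 GiB is an arbitrary threshold; a core file can be as large as the process image.
        List<String> found = problems(cwd, 1L << 30);
        if (found.isEmpty()) {
            System.out.println("No directory-related problems found in " + cwd);
        }
        found.forEach(p -> System.out.println("Problem: " + p));
    }
}
```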

How can we collect a core dump on Windows?

On the Windows operating system there are three types of crash dumps:

  1. Dr. Watson logfile, which is a text error log file that includes faulting stack trace and a few other details.
  2. User minidump, which can be considered a "partial" core dump. It is not a complete core dump, because it does not contain all the useful memory pages of the process.
  3. Dr. Watson full-dump, which is equivalent to a Unix core dump. This dump contains most memory pages of the process (except for code pages).

When an unexpected exception occurs on Windows, the action taken depends on two values in the following registry key: \\HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\AeDebug

The two values are named Debugger and Auto. The Auto value indicates if the debugger specified in the value of the Debugger entry starts automatically when an application error occurs.

  1. A value of 0 for Auto means that the system displays a message box notifying the user when an application error occurs.
  2. A value of 1 for Auto means that the debugger starts automatically.

The value of Debugger is the debugger command that is to be used to debug program errors. When a program error occurs, Windows examines the Auto value and if the value is 0 it executes the command in the Debugger value. If the value for Debugger is a valid command, a message box is created with two buttons: OK and Cancel. If the user clicks OK, the program is terminated. If the user clicks Cancel, the specified debugger is started. If the value for the Auto entry is set to 1 and the value for the Debugger entry specifies the command for a valid debugger, the system automatically starts the debugger and does not generate a message box.

The following are two ways to collect a crash dump on Windows:

  1. Configure Dr. Watson
  2. Force a crash dump

How can we configure Dr. Watson to take a crash dump on Windows?

The Dr. Watson debugger is used to create crash dump files. By default, the Dr. Watson debugger (drwtsn32.exe) is installed into the Windows system folder (%SystemRoot%\System32). To install Dr. Watson as the postmortem debugger, run the following command: drwtsn32 -i

To configure the name and location of crash dump files, run drwtsn32 without any options. In the Dr. Watson GUI window, make sure that the Create Crash Dump File check box is selected and that the crash dump file path and log file path are configured in their respective text fields. Dr. Watson can also be configured through the registry to create a full dump. The registry key is shown in Example 17-4.

System Key: [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\DrWatson]
Entry Name: CreateCrashDump
Value: (0 = disabled, 1 = enabled)

Note: If the application handles the exception, then the registry-configured debugger is not invoked. In that case it might be appropriate to use the -XX:+ShowMessageBoxOnError command-line option to force the process to wait for user intervention on fatal error conditions.

How can we force a crash dump on Windows?

On the Windows operating system, the userdump command-line utility can be used to force a Dr. Watson dump of a running process. The userdump utility does not ship with Windows but instead is released as a component of the OEM Support Tools package. An alternative way to force a crash dump is to use the windbg debugger. The main advantage of using windbg is that it can attach to a process in a non-invasive manner (that is, read-only). Normally Windows terminates a process after a crash dump is obtained but with the non-invasive attach it is possible to obtain a crash dump and let the process continue. To attach the debugger non-invasively requires selecting the Attach to Process option and the Noninvasive checkbox.

When the debugger is attached, a crash dump can be obtained using the command shown:

.dump /f crash.dmp

The windbg debugger is included in the "Debugging Tools for Windows" download. An additional utility in this download is the dumpchk.exe utility, which can verify that a memory dump file has been created correctly. Both userdump.exe and windbg require the pid of the process. The userdump -p command lists the process and program for all processes. This is useful if you know that the application is started with the java.exe launcher. However, if a custom launcher is used (embedded VM), it might be difficult to recognize the process. In that case you can use the jps command-line utility as it lists the pids of the Java processes only.

As with Oracle Solaris and Linux operating systems, you can also use the -XX:+ShowMessageBoxOnError command-line option on Windows. When a fatal error is encountered, the process shows a message box and waits for a yes or no response from the user. Before clicking Yes or No, you can use the userdump.exe utility to generate the Dr. Watson dump for the Java process. This utility can also be used in cases when the process appears to be hung.

Examples:

WAITING: the thread is waiting indefinitely, and the wait ends only when some other thread performs an appropriate action. Examples are Object.wait and Thread.join called without a timeout.

TIMED_WAITING: the thread is waiting for at most a specified time. Examples are Thread.sleep, and Object.wait or Thread.join called with a timeout.

BLOCKED: the thread is blocked waiting for a monitor lock, either to enter a synchronized block / method or to re-enter one after calling Object.wait.

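// WAITING example: each thread enters t1() and calls wait(), which releases the
// monitor, so the other thread can enter and wait as well.
public class ThreadStateExample implements Runnable {
    public void run() {
        // Both branches call t1(), so both threads take the same path.
        if (Thread.currentThread().getName().equals("Thread-1")) {
            t1();
        } else {
            t1();
        }
    }

    private synchronized void t1() {
        while (true) {
            try {
                wait(); // nobody calls notify(), so the thread stays in WAITING
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    public static void main(String[] args) {
        ThreadStateExample someVariableName = new ThreadStateExample();
        Thread t1 = new Thread(someVariableName, "Thread-1");
        Thread t2 = new Thread(someVariableName, "Thread-2");
        t1.start();
        t2.start();
    }
}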

----

// BLOCKED example (deadlock): Thread-1 locks object and then wants object1, while
// Thread-2 locks object1 and then wants object. Each thread blocks on the inner lock.
public class ThreadStateExample implements Runnable {
    private final Object object = new Object();
    private final Object object1 = new Object();

    public void run() {
        if (Thread.currentThread().getName().equals("Thread-1")) {
            synchronized (object) {
                try {
                    Thread.sleep(100); // give the other thread time to take its first lock
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                synchronized (object1) {
                    t1();
                }
            }
        } else {
            synchronized (object1) {
                try {
                    Thread.sleep(100); // give the other thread time to take its first lock
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                synchronized (object) {
                    t1();
                }
            }
        }
    }

    // Never reached once the two threads have deadlocked.
    private synchronized void t1() {
        while (true) {
        }
    }

    public static void main(String[] args) {
        ThreadStateExample someVariableName = new ThreadStateExample();
        Thread t1 = new Thread(someVariableName, "Thread-1");
        Thread t2 = new Thread(someVariableName, "Thread-2");
        t1.start();
        t2.start();
    }
}
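
----

The states can also be observed programmatically with Thread.getState, without taking a dump. The following is a sketch under stated assumptions: the class and thread names are hypothetical, the helper threads are daemon threads so that the JVM can exit, and each state is polled for up to about five seconds. The intent is that the three threads are seen in TIMED_WAITING, WAITING and BLOCKED respectively.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.LockSupport;

class ThreadStateObserver {
    /** Polls until the thread reaches the wanted state or about five seconds have passed. */
    static Thread.State awaitState(Thread t, Thread.State wanted) {
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (t.getState() != wanted && System.nanoTime() < deadline) {
            LockSupport.parkNanos(10_000_000L); // back off for roughly 10 ms
        }
        return t.getState();
    }

    static Thread daemon(String name, Runnable body) {
        Thread t = new Thread(body, name);
        t.setDaemon(true); // let the JVM exit even though these threads never finish
        t.start();
        return t;
    }

    static Map<String, Thread.State> observe() {
        Map<String, Thread.State> states = new LinkedHashMap<>();
        Object lock = new Object();
        Object monitor = new Object();

        // Thread.sleep puts the thread into a wait with a time limit.
        Thread sleeper = daemon("sleeper", () -> {
            try { Thread.sleep(60_000); } catch (InterruptedException ignored) { }
        });
        states.put("sleeper", awaitState(sleeper, Thread.State.TIMED_WAITING));

        // Object.wait without a timeout waits until another thread notifies (nobody does).
        Thread waiter = daemon("waiter", () -> {
            synchronized (lock) {
                try { while (true) lock.wait(); } catch (InterruptedException ignored) { }
            }
        });
        states.put("waiter", awaitState(waiter, Thread.State.WAITING));

        // The calling thread holds the monitor, so the new thread cannot enter the block.
        synchronized (monitor) {
            Thread blocked = daemon("blocked", () -> { synchronized (monitor) { } });
            states.put("blocked", awaitState(blocked, Thread.State.BLOCKED));
        }
        return states;
    }

    public static void main(String[] args) {
        observe().forEach((name, state) -> System.out.println(name + ": " + state));
    }
}
```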