Tuesday, December 25, 2018

JVM architecture

Understanding JVM internals

Articles:
1. Introduction level: GeeksforGeeks and DZone
2. A little deeper: Understanding JVM internals
3. JVM specification: Java SE 8

Lessons learned 

Stack-based virtual machine: The most popular computer architectures, such as Intel x86 and ARM, are register-based. The JVM, however, is stack-based: instructions take their operands from an operand stack and push their results back onto it.
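
To make the stack-based idea concrete, here is a minimal sketch of a toy stack evaluator. The class StackMachineSketch and its three instructions (push, add, mul) are hypothetical and are not real JVM bytecode; they only imitate how operands flow through an operand stack instead of named registers.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy evaluator, not part of the JVM: it only mimics how
// stack-based instructions pass operands through an operand stack.
class StackMachineSketch {

    // Evaluates a space-separated program such as "push 2 push 3 add".
    static int run(String program) {
        Deque<Integer> stack = new ArrayDeque<>();
        String[] tokens = program.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            switch (tokens[i]) {
                case "push":
                    // The operand follows the instruction name.
                    stack.push(Integer.parseInt(tokens[++i]));
                    break;
                case "add":
                    stack.push(stack.pop() + stack.pop());
                    break;
                case "mul":
                    stack.push(stack.pop() * stack.pop());
                    break;
                default:
                    throw new IllegalArgumentException("Unknown instruction: " + tokens[i]);
            }
        }
        // The result is whatever is left on top of the stack.
        return stack.pop();
    }

    public static void main(String[] args) {
        // Computes (2 + 3) * 4 without naming any register.
        System.out.println(run("push 2 push 3 add push 4 mul")); // prints 20
    }
}
```

Note that no instruction names where its inputs live; the stack position is implicit, which is what keeps such an instruction set compact.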

Symbolic reference: All types (class and interface) except for primitive data types are referred to through symbolic reference, instead of through explicit memory address-based reference.

Network byte order: The Java class file uses the network byte order. To maintain platform independence between the little endian used by Intel x86 Architecture and the big endian used by the RISC Series Architecture, a fixed byte order must be kept. Therefore, JVM uses the network byte order, which is used for network transfer. The network byte order is the big endian.
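
A small sketch of what the fixed byte order means in practice. It encodes the class file magic number 0xCAFEBABE (a detail from the JVM specification, not from the notes above) in both byte orders using java.nio.ByteBuffer. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative helper (hypothetical names): shows how one int is laid out
// in big-endian (network) order versus little-endian order.
class ByteOrderSketch {

    // The first four bytes of every class file, per the JVM specification.
    static final int CLASS_FILE_MAGIC = 0xCAFEBABE;

    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            // Mask to avoid sign extension of negative bytes.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Big endian: most significant byte first, as stored in a class file.
        System.out.println(hex(encode(CLASS_FILE_MAGIC, ByteOrder.BIG_ENDIAN)));    // prints CAFEBABE
        // Little endian: the same value with its bytes reversed.
        System.out.println(hex(encode(CLASS_FILE_MAGIC, ByteOrder.LITTLE_ENDIAN))); // prints BEBAFECA
    }
}
```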

* The class file itself is a binary file that cannot be understood by a human. To manage this file, JVM vendors provide javap, the disassembler. The result of using javap is called Java assembly.

* The JVM follows the delegation-hierarchy principle to load classes. The system class loader delegates a load request to the extension class loader, which in turn delegates it to the bootstrap class loader. If the class is found on the bootstrap path, it is loaded there; otherwise the request passes back to the extension class loader and then to the system class loader. If the system class loader also fails to load the class, a java.lang.ClassNotFoundException is thrown at run time (note that it is a checked exception, not a RuntimeException).

* The method area can be implemented in different ways by each JVM vendor. The Oracle HotSpot JVM called it the Permanent Area or Permanent Generation (PermGen) up to Java 7; Java 8 replaced PermGen with Metaspace. Garbage collection of the method area is optional for each JVM vendor.

* The bytecode that the class loader places in the JVM's runtime data areas is executed by the execution engine. The execution engine reads the Java bytecode one instruction at a time, much like a CPU executing machine instructions one by one.

* Interpreter: Reads, interprets and executes the bytecode instructions one by one. Because it works one instruction at a time, it can interpret a single bytecode quickly, but it executes the interpreted result slowly. This is the disadvantage of an interpreted language. The 'language' called bytecode basically runs like an interpreted language.

* JIT (Just-In-Time) compiler: The JIT compiler was introduced to compensate for the disadvantages of the interpreter. The execution engine runs as an interpreter first, and at the appropriate time the JIT compiler compiles the whole bytecode of a method into native code. After that, the execution engine no longer interprets the method but executes the native code directly. Execution in native code is much faster than interpreting instructions one by one, and the compiled code can be reused quickly because the native code is stored in a cache.

* It takes more time for the JIT compiler to compile the code than for the interpreter to interpret it one instruction at a time. If the code is executed just once, it is therefore better to interpret it than to compile it. For this reason, JVMs that use a JIT compiler internally check how frequently each method is executed and compile a method only when the frequency exceeds a certain level.

* The Hotspot VM is divided into the Server VM and the Client VM, and the two VMs use different JIT compilers.
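
The class loader delegation described above can be observed from ordinary code. This is a minimal sketch with hypothetical class and method names; it walks the parent chain of the loader that loaded the sketch itself and shows what happens when no loader in the chain can find a class. The loader names printed depend on the Java version, so no output is shown for that part.

```java
// Illustrative sketch (hypothetical names): inspects the class loader chain
// and the failure case of delegation.
class ClassLoaderSketch {

    // The bootstrap class loader is represented as null in the Java API.
    static String describe(ClassLoader loader) {
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    // Returns false when every loader in the delegation chain fails.
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // Checked exception thrown after the whole chain has failed.
            return false;
        }
    }

    public static void main(String[] args) {
        // Walk from the loader of this class up to the bootstrap loader.
        ClassLoader loader = ClassLoaderSketch.class.getClassLoader();
        while (loader != null) {
            System.out.println(describe(loader));
            loader = loader.getParent();
        }
        System.out.println(describe(null));

        // Core classes such as String come from the bootstrap loader.
        System.out.println(describe(String.class.getClassLoader())); // prints bootstrap

        // "no.such.pkg.MissingClass" is a made-up name that should not exist.
        System.out.println(canLoad("no.such.pkg.MissingClass")); // prints false
    }
}
```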


Sunday, December 23, 2018

Java String

Java String class properties
- immutable: makes it thread-safe
- final
- hash code is cached
- immutability makes a String pool possible
Why is String final in Java? link

StringBuffer vs StringBuilder
StringBuffer and StringBuilder are mutable classes. StringBuffer operations are synchronized and therefore thread-safe, whereas StringBuilder operations are not. So when multiple threads work on the same character sequence, we should use StringBuffer, but in a single-threaded environment we should use StringBuilder. StringBuilder is faster than StringBuffer because it has no synchronization overhead.
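
A minimal sketch of the two usage patterns, with hypothetical class and method names. Several threads append to one shared StringBuffer, while a StringBuilder is kept local to a single thread. It demonstrates only that StringBuffer does not lose appends under contention; it does not measure performance.

```java
// Illustrative sketch (hypothetical names): shared StringBuffer versus
// thread-confined StringBuilder.
class AppendSketch {

    // Appends one character at a time from several threads to one shared
    // StringBuffer and returns the final length.
    static int appendConcurrently(int threads, int appendsPerThread) {
        StringBuffer shared = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < appendsPerThread; j++) {
                    shared.append('x'); // synchronized, so no append is lost
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
        return shared.length();
    }

    // Single-threaded building: StringBuilder avoids the locking cost.
    static String buildLocally(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append('x');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(appendConcurrently(4, 1000)); // prints 4000
        System.out.println(buildLocally(3));             // prints xxx
    }
}
```

Replacing the shared StringBuffer with a StringBuilder in appendConcurrently would make the result unpredictable, since unsynchronized appends can be lost or throw.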

String pool
- Introduction article. link Baeldung
- Performance of String.intern(): link. Tip from the article: use a prime number for -XX:StringTableSize=N, for example N = 1,000,003 instead of 1,000,000

Oracle String.intern()
Returns a canonical representation for the string object.
A pool of strings, initially empty, is maintained privately by the class String.
When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.
All literal strings and string-valued constant expressions are interned. String literals are defined in section 3.10.5 of The Java™ Language Specification.
Returns:
a string that has the same contents as this string, but is guaranteed to be from a pool of unique strings.
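
The contract quoted above can be checked with a few lines. This is a small sketch with a hypothetical class name; it relies only on the documented behaviour that literals are interned and that new String(...) creates a distinct object.

```java
// Illustrative sketch (hypothetical names): identity versus equality for
// pooled and non-pooled strings.
class InternSketch {

    // Reference comparison, deliberately using == instead of equals().
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "jvm";          // interned automatically as a literal
        String copy = new String("jvm"); // a separate object on the heap

        System.out.println(copy.equals(literal));              // prints true
        System.out.println(sameObject(copy, literal));          // prints false
        System.out.println(sameObject(copy.intern(), literal)); // prints true
    }
}
```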

Substring memory leak problem
- Cause and fix: link
- Fixed as of Java 7u6: link (broken), new link
* The original assumptions around the String object implementing a flyweight pattern are no longer regarded as valid.


Friday, December 21, 2018

Java references

Java has four types of references:
  • Strong
  • Weak
  • Soft
  • Phantom
Soft vs Weak vs Phantom References
Strong
- Purpose: An ordinary reference. Keeps objects alive as long as they are referenced.
- Use: Normal reference.
- When GCed: Any object not pointed to can be reclaimed.
Soft
- Purpose: Keeps objects alive provided there's enough memory.
- Use: To keep objects alive even after clients have removed their references (memory-sensitive caches), in case clients start asking for them again by key.
- When GCed: After a first gc pass, the JVM decides it still needs to reclaim more space.
Weak
- Purpose: Keeps objects alive only while they're in use (reachable) by clients.
- Use: Containers that automatically delete objects no longer in use.
- When GCed: After gc determines the object is only weakly reachable.
Phantom
- Purpose: Lets you clean up after finalization but before the space is reclaimed (replaces or augments the use of finalize()).
- Use: Special clean up processing.
- When GCed: After finalization.
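
A minimal sketch of the difference between a strong and a weak reference, using java.lang.ref.WeakReference. The class and method names are hypothetical. Garbage collection timing is not deterministic and System.gc() is only a hint, so the value printed after the strong reference is dropped is deliberately not stated.

```java
import java.lang.ref.WeakReference;

// Illustrative sketch (hypothetical names): a weak reference does not keep
// its referent alive on its own.
class ReferenceSketch {

    // While a strong reference exists, the weak reference must still
    // return the same object.
    static boolean reachableWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        return weak.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println(reachableWhileStronglyHeld()); // prints true

        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);

        // Drop the only strong reference; the object is now weakly reachable.
        strong = null;
        System.gc(); // a request only, the JVM may ignore it

        // Usually null after a collection, but this is not guaranteed.
        System.out.println(weak.get());
    }
}
```

A SoftReference is used the same way; the difference is only in when the collector chooses to clear it.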

Articles to read

Baeldung :  weak, soft, phantom
Kdgregory : java references