An Overview on Garbage Collector Algorithms

Before we move into the details of algorithms that are used in Java Virtual Machine (JVM), let’s have a common understanding of what GC Roots are. These are the objects which are directly accessible from outside the heap memory.
(1) Active Threads
(2) Static Variables
(3) Local Variables (Accessible via Stack of a Thread)
(4) JNI References

There are three basic types of algorithms that are used in Java Virtual Machine (JVM):-
(1) Mark-sweep
(2) Mark-sweep-compact
(3) Mark-copy

Mark:- All of the algorithms discussed have the same mark phase. Marking phase is about traversing the whole object graph, starting from GC Roots. When GC visits the object, it marks it as accessible and thus alive. All the objects which are not reachable from GC Roots are garbage. Marking requires Stop-The-World (STW) pauses, because the running application threads could interfere. How long the STW pause is, depends mostly on the number of visited objects.

Mark-Sweep:-After marking phase, we have the memory space which is occupied by visited (accessible via GC Roots) and unvisited objects. Sweep phase releases the memory fragments which contains unreachable objects. It is simple, but because the dead objects are not necessarily next to each other, we end up having a fragmented memory. That’s not bad but trying to fit a too large object into the memory could potentially lead to OutOfMemoryError.

Mark-Sweep-Compact:-This algorithm fixes the problem with fragmented memory. After all alive objects are marked, they are moved to the beginning of the memory space. That helps to avoid having too fragmented (split) memory, but compacting the heap isn’t for free. Copying objects and updating all references to them take time and it all happens during STW pause.

Mark-Copy:- Mark-copy algorithm copies all alive objects to a new memory region. The previously occupied region is considered to be free. Next time mark-copy is executed, all the alive objects are moved back to the previous memory region. As you can imagine, this, of course, leads to a memory compaction. Unfortunately, it requires additional extra region large enough to fit all live objects at any given point in time.

Advertisements

Outlining Garbage Collectors

Java Virtual Machine (JVM) heap is divided into two different Generations. One is called Young Generation and the second one is Old Generation (sometimes referred to as Tenured). The Young Generation is further separated into two main logical sections: Eden and Survivor spaces. There are also Virtual spaces for both Young and the Old Generations which are used by Garbage Collectors to resize other regions – mainly to meet different GC goals.

Object Life Cycle
Objects start their journey in Eden space of the Young Generation. When Eden fills up, so called Minor GC is performed: all application threads are stopped (Stop-The-World Pause), objects which are not used anymore are discarded and all other objects from the Eden are moved to the first Survivor space (S0). Next time a Minor GC is performed, the objects goes from S0 to the second Survivor space (S1). All live objects from Eden goes to S1 as well. Notice that it leads to differently aged object in the Survivor space – we have objects from Eden and objects which were already in the Survivor space. Next iteration of Minor GC moves the objects from S1 back to the S0, so the Survivor spaces switch every GC. It is significant to understand the switching of objects in Survivor spaces because when object reaches certain age threshold, it is promoted to the Old Generation. It leads to Survivor space fragmentation (division) which can be easily eliminated with moving all objects from S0 to S1 and back every Minor GC. Eventually, when the Old Generation fills up, a Major GC will be performed on the Old Generation which cleans it up and compacts that space. If and how Stop-The-World pauses occur during Major GC depends on specific GC algorithm used.

Types of Garbage Collectors
Let’s have an overview on three types of garbage collectors:-

(1) Serial Collector: As the name suggests, the collection is performed by only one thread. Stop-the-world (STW) pauses are necessary during both Minor and Full GC. This collector uses the Mark-Copy algorithm for the Young Generation, whereas the Old Generation is cleaned up using the Mark-Sweep-Compact algorithm. Serial GC is designed for single-threaded environments (usually client-class machines) and for relatively small heaps.

(2) Parallel Collector: The Young collection is parallelized by multiple threads which makes Minor GC much faster. As a result, this collector leads to shorter, but more frequent Young collection STW pauses. Since JDK 7u4 [Java 7 update release 4], the Old Generation is also collected by multiple threads by default (and also causes stop-the-world pauses). This collector also uses the Mark-Copy algorithm in the Young Generation and Mark-Sweep-Compact in the Old Generation, but both copy and compact phases are executed by multiple threads.

(3) Concurrent Mark and Sweep Collector: Minor GC is performed with multiple threads using the parallel Mark-Copy algorithm. All application threads are stopped then. The Old Generation is mostly collected concurrently – application threads are paused for very short periods of time when the background GC thread scans the Old Generation. The actual algorithm used during Major GC is concurrent Mark-Sweep. Concurrent Mark and Sweep is the collector which doesn’t compact the Tenured space (Old Generation) and thus the memory can be left fragmented. Due to lack of heap compaction, when GC is not able to fit new objects into the memory, JVM fallbacks to the serial mark-sweep-compact algorithm to defragment and compact the Old Generation. That’s when performance degradation comes into play – all application threads are stopped and just one single thread is responsible for cleaning and compacting the Tenured space.

(4) G1GC (Garbage First Garbage Collector): Garbage First (G1) is a new low-pause garbage collector designed to process large heaps with minimal pauses. The heap is broken down into several regions of fixed size (while still maintaining the generational nature of the heap). That kind of design allows us to get rid of long STW pauses when the entire Young or Old Generations are processed. Now, each region can be collected separately which leads to shorter, but more frequent STW pauses. G1 copies objects from one region into another, which means that the heap is at least partially compacted. G1 uses an incremental version of the Mark-Sweep-Compact algorithm.

PermGen and MetaSpace
PermGen is an abbreviation for Permanent Generation and it’s a special heap space which is separate from the main Java heap where JVM keeps track of metadata of the classes which have been loaded. In Java 8, PermGen has been renamed to Metaspace – with some subtle differences. From dvelopment perspective, it is important to note that Metaspace has unlimited default maximum size. On the contrary, PermGen from Java 7 and earlier has the default maximum size of 64 MB on 32-bit JVM and 82 MB on 64-bit one.

Purpose of [^\x20-\x7E] in Regular Expressions

Purpose of [^\x20-\x7E] in Regular Expressions defines all characters that are not (^) in the range \x20-\x7E (hex 0x20 to 0x7E). According to http://www.asciitable.com/, those are characters from ‘space’ to ‘~’. Hence, given below Java code will work very efficiently:-

String updateMessage = message.replaceAll("[^\\x20-\\x7e]", "");

Happy Coding !!

How to access Tomcat Manager from Eclipse

It has been observed that Tomcat Manager is not accessible after starting the Tomcat Server from Eclipse. Even after doing changes in file – $TOMCAT_HOME/conf/tomcat-users.xml changes are not getting reflected back.

Therefore, we need to configure Eclipse to take control over from Tomcat, we need to do double-click the Tomcat Server entry in Servers tab and then we will get the server configuration. At the left column, under Server Locations, select Use Tomcat installations (If it is grayed-out then delete the Server and add it again). This way Eclipse will take full control over Tomcat and we will be able to access the default Tomcat homepage with the Tomcat Manager when running from inside Eclipse.

Configuration to access Tomcat Manager from Eclipse
Configuration to access Tomcat Manager from Eclipse

Note: Above stated configuration is validated on Apache-Tomcat-8.0.33

How to enable Tomcat Manager

To access the Tomcat Manager page, we need to define users in file – $TOMCAT_HOME/conf/tomcat-users.xml. By default, there is NO user which means that no one can access the Tomcat Manager page.

To access the above mentioned page, we need to add a user(s) as the role “manager-gui”

 
<tomcat-users>
	<role rolename="manager-gui"/>
	<user username="admin" password="admin" roles="manager-gui"/>
</tomcat-users>

Once done then save the file and restart the Tomcat Server and we should able to access the default manager page (http://localhost:8080/manager) with user = “admin” and password = “admin”

Note: Above stated configuration is validated on Apache-Tomcat-8.0.33

Powermock : MockClassLoader Issue with Log4j

When running a PowerMock Test Log4j gives me the following (or something similar) error:-

log4j:ERROR A "org.apache.log4j.xml.DOMConfigurator" object is not assignable to a "org.apache.log4j.spi.Configurator" variable.
log4j:ERROR The class "org.apache.log4j.spi.Configurator" was loaded by
log4j:ERROR [org.powermock.core.classloader.MockClassLoader@14a55f2] whereas object of type
log4j:ERROR "org.apache.log4j.xml.DOMConfigurator" was loaded by [sun.misc.Launcher$AppClassLoader@92e78c].
log4j:ERROR Could not instantiate configurator [org.apache.log4j.xml.DOMConfigurator].

or

Caused by: org.apache.commons.logging.LogConfigurationException:Invalid class loader hierarchy.  
You have more than one version of 'org.apache.commons.logging.Log' visible, which is not allowed.

There are a couple of different solutions to this:
# Upgrade to PowerMock 1.3+
# Make use of the @PowerMockIgnore annotation at the class-level of the test.
For example if using log4j, use `@PowerMockIgnore(“org.apache.log4j.*”)` if using commons logging, use `@PowerMockIgnore(“org.apache.commons.logging.*”)`.

Reference URL :-
http://powermock.googlecode.com/svn-history/r2047/wiki/FAQ.wiki

(403) Forbidden Error Message

There could be a couple of reasons when we can see a (403) Forbidden error message in dynamic web-application. Details for the same could be found at given below URL :-

(1) https://mediatemple.net/community/products/dv/204644980/why-am-i-seeing-a-403-forbidden-error-message

(2) https://en.wikipedia.org/wiki/HTTP_403

However, today in our application we have faced the same scenario wherein we got (403) Forbidden error message while hitting the URL :

http://chauhan:9081/servicesApplication/port/v1_2

After initial investigation, it has been found that due to network/firewall patches, we have to give the full qualified URLs i.e.

Example: http://chauhan.techDomain.com:9081/servicesApplication/port/v1_2 

Enjoy Learning 🙂