Thursday, May 28, 2020

Java Multithreading

Why do we need threads?

Responsiveness - can be achieved with Concurrency (Multitasking)
Performance - can be achieved with Parallelism

Context Switching

Context switching is expensive
Context switching between threads of the same process is a lot cheaper than context switching between processes
Too many threads cause thrashing - the OS spends more time on thread management than on real productive work
Threads consume fewer resources than processes

Thread scheduling

There are several possible ways to schedule threads

First Come First Serve - the problem is that if long-running threads arrive first, the other threads become unresponsive. This is called starvation
Shortest Job First - this time the longest jobs may wait indefinitely
Epochs - the OS divides CPU time into moderately sized pieces called epochs and allocates a different time slice to each thread in each epoch, according to dynamic priority calculations

Thread creation & its methods

Two ways of creating threads

Implement the Runnable interface and pass it to the constructor of a Thread object
Extend the Thread class and override run()
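
A minimal sketch of both ways side by side. The class names ThreadCreationSketch and CountingThread are made up for this note; the shared AtomicInteger is only there so the result can be checked.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadCreationSketch {
    // Way 2: extend Thread and override run().
    static class CountingThread extends Thread {
        private final AtomicInteger counter;

        CountingThread(AtomicInteger counter) {
            this.counter = counter;
        }

        @Override
        public void run() {
            counter.incrementAndGet();
        }
    }

    // Starts one thread of each kind and returns how many of them ran.
    static int runBoth() {
        AtomicInteger counter = new AtomicInteger();
        // Way 1: pass a Runnable (here a lambda) to the Thread constructor.
        Thread fromRunnable = new Thread(() -> counter.incrementAndGet());
        Thread fromSubclass = new CountingThread(counter);
        fromRunnable.start();
        fromSubclass.start();
        try {
            fromRunnable.join();
            fromSubclass.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

The Runnable variant is usually the more flexible one, because the task stays independent of the Thread class and can later be handed to a thread pool unchanged.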

For CPU-bound work, the number of threads should be roughly equal to the number of cores in the machine
Use thread.setUncaughtExceptionHandler to handle unchecked exceptions that escape run() at run time.
In the handler you can either clean up resources or log the issue for troubleshooting purposes
There are two ways of stopping a thread from another thread

Thread.interrupt() - interrupting a thread works in two scenarios

If the thread is executing a method that throws an InterruptedException
If the thread's code is handling the interrupt signal explicitly
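
The two interrupt scenarios can be sketched roughly like this. InterruptSketch and its method names are hypothetical, and the 60-second sleep is just a stand-in for any blocking call.

```java
class InterruptSketch {
    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Scenario 1: the worker is inside a method that throws InterruptedException.
    static boolean interruptBlockedWorker() {
        boolean[] stoppedByInterrupt = { false };
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // stand-in for a long blocking call
            } catch (InterruptedException e) {
                stoppedByInterrupt[0] = true; // clean up and leave run()
            }
        });
        worker.start();
        worker.interrupt();
        joinQuietly(worker);
        return stoppedByInterrupt[0];
    }

    // Scenario 2: the worker never blocks, so it polls the interrupt flag itself.
    static boolean interruptBusyWorker() {
        boolean[] stoppedByInterrupt = { false };
        Thread worker = new Thread(() -> {
            long spins = 0;
            while (!Thread.currentThread().isInterrupted()) {
                spins++; // stand-in for a long computation
            }
            stoppedByInterrupt[0] = true;
        });
        worker.start();
        worker.interrupt();
        joinQuietly(worker);
        return stoppedByInterrupt[0];
    }
}
```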

Daemon threads - background threads that do not prevent the application from exiting when the main thread terminates. Another reason to use them: the code in a worker thread is not under our control, and we do not want it to block our application from terminating

By default, the application will not stop as long as at least one non-daemon thread is running, even if the main thread has finished. So we need to stop all threads gracefully
Thread.join()

calling the join() method has a synchronization effect. join() creates a happens-before relationship
Happens-before : This means that when a thread t1 calls t2.join(), then all changes done by t2 are visible in t1 on return. However, if we do not invoke join() or use other synchronization mechanisms, we do not have any guarantee that changes in the other thread will be visible to the current thread even if the other thread has completed.
When we invoke the join() method on a thread, the calling thread goes into a waiting state. It remains in a waiting state until the referenced thread terminates.
Timed join() is dependent on the OS for timing. So, we cannot assume that join() will wait exactly as long as specified.
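
A small sketch of the happens-before effect of join(), assuming a hypothetical JoinSketch class. The field is deliberately a plain (non-volatile) static, so the only thing that makes the worker's write visible to the caller is the join().

```java
class JoinSketch {
    private static int result; // plain field: no volatile, no lock

    static int computeInWorker() {
        Thread worker = new Thread(() -> result = 6 * 7);
        worker.start();
        try {
            // The caller waits here until the worker terminates.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // join() returned, so everything the worker wrote is visible now.
        return result;
    }
}
```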

To avoid the cost of repeatedly creating and destroying threads, use a thread pool.

Data Sharing between Threads

Local variables are stored on the stack, which is private to each thread. This covers local primitives and local object references
Shared information is stored on the heap. This covers objects, class members and static variables
A critical section is guarded with the synchronized keyword. There are two ways of doing this

synchronized at method level - uses the monitor of the object (or of the class, for static methods)

synchronized block inside a method with an explicit object - the lock

Re-entrant - a thread inside a synchronized method/section can enter another synchronized method/section guarded by the same lock
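
A rough sketch of the two forms of synchronized plus re-entrancy. SynchronizedCounter and its members are invented names for illustration.

```java
class SynchronizedCounter {
    private final Object lock = new Object();
    private int viaMethod;
    private int viaBlock;

    // Method level: the monitor is this SynchronizedCounter instance.
    synchronized void incrementViaMethod() {
        viaMethod++;
    }

    // Re-entrancy: the thread already owns the monitor, so the nested
    // synchronized calls do not block.
    synchronized void incrementTwice() {
        incrementViaMethod();
        incrementViaMethod();
    }

    // Block level: only the critical section is guarded, by an explicit lock object.
    void incrementViaBlock() {
        synchronized (lock) {
            viaBlock++;
        }
    }

    // Runs the increments from several threads and returns {viaMethod, viaBlock}.
    static int[] hammer(int threads, int perThread) {
        SynchronizedCounter c = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    c.incrementTwice();
                    c.incrementViaBlock();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // join() makes the workers' writes visible to this thread.
        return new int[] { c.viaMethod, c.viaBlock };
    }
}
```

Without the synchronized keyword the two counters would typically come out lower than expected, because ++ is a read-modify-write and not atomic.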

Atomic Operations

Object reference assignment - including simple getters and setters, for example
Primitive type assignments, except long and double, because they are 64 bits wide and a write may be split into two 32-bit operations
We can declare long and double fields volatile. Reads and writes of a volatile long or double are guaranteed to be atomic
Knowledge of atomic operations is key to creating high-performance applications

Concurrency problems

Race condition : two threads work on the same shared object and at least one of them modifies it; depending on how the OS schedules them, the result may be incorrect. The core of the problem is a non-atomic operation performed on a shared object. Solution - identify the critical section where the race condition happens and protect it with a synchronized block.
https://stackoverflow.com/questions/34510/what-is-a-race-condition
Data race : solution, establish happens-before semantics by one of these methods

synchronizing the methods that read and write the shared variables
declaring the shared variables volatile. Code before a volatile write stays before it, and code after a volatile read stays after it, so the compiler and CPU cannot reorder across them.

Locking Strategies

Coarse-grained strategy : one lock for the whole object. Simple, but might hurt performance
Fine-grained strategy : lock parts of the shared object separately, each with its own lock object

Deadlock

Conditions that lead to deadlock (all four must hold)

Mutual exclusion
Hold and wait
Non-preemptive allocation
Circular wait

The solution to deadlock is to break at least one of the conditions above

Avoid circular wait - this is the easiest one: always acquire locks in the same order.
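
One common way to avoid circular wait is a fixed global lock order. The sketch below assumes hypothetical Account objects with unique ids and always locks the account with the smaller id first, so two opposite transfers cannot each hold one lock and wait for the other.

```java
class Account {
    final int id;
    long balance;

    Account(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }
}

class LockOrderingSketch {
    static void transfer(Account from, Account to, long amount) {
        // Lock order depends only on the ids, never on the transfer direction.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Two threads transfer in opposite directions; returns the final balances.
    static long[] run() {
        Account a = new Account(1, 10_000);
        Account b = new Account(2, 10_000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) transfer(b, a, 2);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.balance, b.balance };
    }
}
```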

Deadlock detection

Watchdog
Thread interruption
tryLock operation

Reentrant Lock

Works like synchronized locking but gives more control over the lock through advanced operations
Pattern for using it
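import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class SharedData {
    private final Lock lockObject = new ReentrantLock();

    public void method() {
        lockObject.lock();
        try {
            useSharedObject();
        } finally {
            // Always unlock in finally so an exception cannot leave the lock held
            lockObject.unlock();
        }
    }

    private void useSharedObject() {
        // placeholder for the work on the shared state
    }
}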
To avoid starvation (one thread keeps using the shared object while the others wait), pass true to the constructor: new ReentrantLock(true). This is the fairness flag. It comes at a cost in throughput, so use it only when you really need it.
ReentrantLock.lockInterruptibly() - acquires the lock unless the waiting thread is interrupted
ReentrantLock.tryLock() - acquires the lock only if it is free at the time of the call and returns immediately either way
ReentrantReadWriteLock - use it when the shared object is read-intensive; otherwise it can perform worse than a traditional lock. A typical example is a cache, where the system is read-intensive. Multiple reader threads can hold the read lock at the same time, and we can query the number of concurrent readers. Only one writer thread can hold the write lock, and no other reader or writer can access the shared object while it is held.
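
A minimal sketch of the read-intensive cache case with ReentrantReadWriteLock. ReadMostlyCache is a made-up class; a real cache would also need eviction and so on.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadMostlyCache {
    private final Map<String, String> data = new HashMap<>();
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

    // Many threads may hold the read lock at the same time.
    String get(String key) {
        rw.readLock().lock();
        try {
            return data.get(key);
        } finally {
            rw.readLock().unlock();
        }
    }

    // The write lock is exclusive: no readers or other writers while it is held.
    void put(String key, String value) {
        rw.writeLock().lock();
        try {
            data.put(key, value);
        } finally {
            rw.writeLock().unlock();
        }
    }

    // Number of read locks currently held, useful for monitoring.
    int activeReaders() {
        return rw.getReadLockCount();
    }
}
```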

Semaphore

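Restricts the number of threads that can access shared data at the same time.
Similar to a lock, but differs in several ways: it can hand out more than one permit, it has no notion of an owner thread, and any thread can release a permit.
One use case is Producer-Consumer. The producer-consumer pattern is used in web sockets, video streaming and actor models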
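
A rough producer-consumer sketch with two semaphores, one counting empty slots and one counting filled slots. BoundedBuffer and the capacity of 2 are assumptions made for the example; in production code java.util.concurrent.BlockingQueue already covers this case.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Semaphore;

class BoundedBuffer {
    private final Queue<Integer> items = new ArrayDeque<>();
    private final Semaphore emptySlots;
    private final Semaphore filledSlots = new Semaphore(0);

    BoundedBuffer(int capacity) {
        emptySlots = new Semaphore(capacity);
    }

    void produce(int item) throws InterruptedException {
        emptySlots.acquire();          // blocks while the buffer is full
        synchronized (items) {
            items.add(item);
        }
        filledSlots.release();         // tells the consumer there is work
    }

    int consume() throws InterruptedException {
        filledSlots.acquire();         // blocks while the buffer is empty
        int item;
        synchronized (items) {
            item = items.remove();
        }
        emptySlots.release();          // frees a slot for the producer
        return item;
    }

    // Sends 1..count through the buffer and returns the sum seen by the consumer.
    static long sumThroughBuffer(int count) {
        BoundedBuffer buffer = new BoundedBuffer(2);
        long[] sum = { 0 };
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) buffer.produce(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) sum[0] += buffer.consume();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }
}
```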

Condition variable

Other methods

wait
notify() and notifyAll()
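
A small sketch of wait() and notifyAll() used as a one-shot signal. ReadySignal is a hypothetical class. The wait() call sits in a while loop because a thread can wake up without the condition being true.

```java
class ReadySignal {
    private boolean ready;

    // Blocks the caller until markReady() has been called.
    synchronized void awaitReady() throws InterruptedException {
        while (!ready) {
            wait(); // releases the monitor while waiting
        }
    }

    synchronized void markReady() {
        ready = true;
        notifyAll(); // wakes every thread waiting on this monitor
    }

    // Returns true if the waiting thread got past awaitReady().
    static boolean demo() {
        ReadySignal signal = new ReadySignal();
        boolean[] proceeded = { false };
        Thread waiter = new Thread(() -> {
            try {
                signal.awaitReady();
                proceeded[0] = true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        signal.markReady();
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return proceeded[0];
    }
}
```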

Lock free programming

AtomicInteger, AtomicLong...
AtomicReferences
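
A minimal sketch of both classes, assuming a hypothetical LockFreeSketch holder. The counter uses AtomicInteger.incrementAndGet(), and the AtomicReference part shows the usual compare-and-set retry loop.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

class LockFreeSketch {
    // Several threads increment one counter without any lock.
    static int countWithAtomicInteger(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementAndGet();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    // CAS loop: read, compute a new value, and retry if another thread got in between.
    static String appendWithCas(AtomicReference<String> ref, String suffix) {
        while (true) {
            String current = ref.get();
            String updated = current + suffix;
            if (ref.compareAndSet(current, updated)) {
                return updated;
            }
        }
    }
}
```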

Reference

https://www.udemy.com/course/java-multithreading-concurrency-performance-optimization/

Tuesday, May 26, 2020

Java Exceptions

All subclasses of RuntimeException (and of Error) are unchecked exceptions; the rest are checked exceptions
Prefer try-with-resources for anything that must be closed. To be used this way, the object must implement AutoCloseable
Throwing and catching exceptions is slow compared with a simple test. When you have the choice, use a simple test (like if (!s.empty()) s.pop()) rather than a guarded section
Throw early, catch late
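
A small try-with-resources sketch. TrackedResource is an invented AutoCloseable that just records what happens, to show that close() runs before the catch block even when the body throws.

```java
class TrackedResource implements AutoCloseable {
    private final StringBuilder log;

    TrackedResource(StringBuilder log) {
        this.log = log;
        log.append("open;");
    }

    @Override
    public void close() {
        log.append("close;");
    }
}

class TryWithResourcesSketch {
    static String run() {
        StringBuilder log = new StringBuilder();
        try (TrackedResource resource = new TrackedResource(log)) {
            log.append("use;");
            throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            // The resource has already been closed when we get here.
            log.append("caught;");
        }
        return log.toString();
    }
}
```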

Sources :

https://www.manishsanger.com/java-exception-hierarchy/

https://examples.javacodegeeks.com/java-throw-exception-example/

Java hashCode()

Objects that are equal must have the same hash code within a running process
Whenever you implement equals, you MUST also implement hashCode
Whenever two different objects have the same hash code, we call this a collision.
A collision is nothing critical; it just means that there is more than one object in a single bucket, so a HashMap lookup has to look again to find the right object. A lot of collisions will degrade the performance of a system, but they won't lead to incorrect results.
It is nice when hashCode produces the same value across different executions of a program, but you should not rely on this. String and Integer always produce the same hash codes. But while most hashCode implementations provide stable values, you must not rely on it. There are Java libraries that actually return different hashCode values in different processes, and this tends to confuse people. Google's Protocol Buffers is an example.
Do not use hashCode in distributed applications
You may know that cryptographic hash codes such as SHA1 are sometimes used to identify objects (Git does this, for example). Is this also unsafe? No. SHA1 uses 160-bit keys, which makes collisions virtually impossible. Even with a gigantic number of objects, the odds of a collision in this space are far below the odds of a meteor crashing the computer that runs your program. This article has a great overview of collision probabilities.
A cryptographic hash such as MD5 or SHA-1 would be ok for many cases, but might be a bit heavyweight if you’re dealing with a really high-throughput service.
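
A minimal sketch of implementing equals and hashCode together, using a hypothetical GridPoint value class. Both methods use the same fields, so equal objects always get the same hash code.

```java
import java.util.Objects;

final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof GridPoint)) {
            return false;
        }
        GridPoint that = (GridPoint) other;
        return x == that.x && y == that.y;
    }

    // Must be derived from the same fields that equals() compares.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

With both methods in place, a HashSet or HashMap finds a GridPoint by value; with only equals overridden, lookups for an equal but distinct instance would usually miss.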

Sources

https://eclipsesource.com/blogs/2012/09/04/the-3-things-you-should-know-about-hashcode/

https://martin.kleppmann.com/2012/06/18/java-hashcode-unsafe-for-distributed-systems.html

How does HashMap work in Java

An array (the table of buckets) is created with a default capacity of 16
On put or get, the map takes the hash code of the key
It rehashes the hash code to protect against a bad hashing function in the key class that would put all data in the same index (bucket) of the inner array
It takes the rehashed hash code and bit-masks it with the length (minus 1) of the array. This operation ensures that the index can't be greater than the size of the array. You can see it as a very computationally optimized modulo function.

It finds the appropriate array index from the hash code and stores the entry in the bucket associated with this index
Since Java 8, if a bucket holds more than 8 entries (and the table has at least 64 buckets) it is automatically converted from a linked list to a red-black tree
The map resizes itself according to the load factor. The initial array size is 16 and the default load factor is 0.75
HashMap is not thread safe. Hashtable is thread safe, but it locks the whole data structure during concurrent access. ConcurrentHashMap, on the other hand, locks only the affected bucket
Integer and String are the most common map keys because they are immutable and provide reliable hashCode implementations
If you have a lot of data to put in a map, it is advisable to create the map with a sufficiently high initial capacity, because growing (resizing) the map carries additional overhead
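
The rehash and bit-mask steps can be sketched as below. HashIndexSketch is a made-up class; the spread function mirrors what OpenJDK 8's HashMap does (XOR of the upper 16 bits into the lower 16), but treat it as an illustration and not as the exact library source.

```java
class HashIndexSketch {
    // Rehash: mix the high bits into the low bits, so keys whose hash codes
    // differ only in the upper bits do not all land in the same bucket.
    static int spread(int hashCode) {
        return hashCode ^ (hashCode >>> 16);
    }

    // Bit-mask with (length - 1). For a power-of-two length this equals a
    // non-negative modulo, but is cheaper to compute.
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    static int bucketOf(Object key, int tableLength) {
        return indexFor(spread(key.hashCode()), tableLength);
    }
}
```

For example, "a".hashCode() is 97, the upper 16 bits are zero, and 97 & 15 is 1, so the key "a" lands in bucket 1 of a 16-slot table.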

Source : http://coding-geek.com/how-does-a-hashmap-work-in-java/

Java Collections

Iterable Interface

Iterable is the root interface of the collection hierarchy: the Collection interface extends Iterable, so every class that implements Collection also implements Iterable.

The Iterable interface contains only one abstract method.

Iterator<T> iterator(): returns an iterator over elements of type T.

Iterator Interface

The iterator interface provides the facility of iterating the elements in a forward direction only.

public interface Iterator<E>{
    E next();
    boolean hasNext();
    void remove();
    default void forEachRemaining(Consumer<? super E> action);
}
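
A minimal sketch of how the two interfaces fit together: a hypothetical Countdown class implements Iterable, which is what makes it usable in a for-each loop.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class Countdown implements Iterable<Integer> {
    private final int start;

    Countdown(int start) {
        this.start = start;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = start;

            @Override
            public boolean hasNext() {
                return next > 0;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return next--;
            }
        };
    }

    // The for-each loop calls iterator(), hasNext() and next() behind the scenes.
    static int sum(int start) {
        int total = 0;
        for (int n : new Countdown(start)) {
            total += n;
        }
        return total;
    }
}
```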

Collection Interface

The Collection interface is the foundation of the Collections framework. It is implemented by all the collection classes in the framework (maps aside) and declares the common methods that every implementation must provide.

public interface Collection<E>{
    boolean add(E element)
    Iterator<E> iterator()
    int size()
    boolean isEmpty()
    boolean contains(Object obj)
    boolean containsAll(Collection<?> c)
    boolean equals(Object other)
    boolean addAll(Collection<? extends E> from)
    boolean remove(Object obj)
    boolean removeAll(Collection<?> c)
    void clear()
    boolean retainAll(Collection<?> c)
    Object[] toArray()
    <T> T[] toArray(T[] arrayToFill)
    ...............
}

Concrete Collections

List Interface

ArrayList and LinkedList implement this interface. The get and set methods perform very differently in the two because of the nature of the underlying array and linked-list data structures. The Java language designers added the RandomAccess tagging interface to distinguish between them

public interface List<E>{
    void add(int index, E element)
    E remove(int index)
    E get(int index)
    E set(int index, E element)
}

Set Interface

* Usually implemented by the HashSet and TreeSet classes

* TreeSet visits elements in sorted order

* A HashSet can become slow if the elements have a poor hashing algorithm. TreeSet performance, on the other hand, is guaranteed, but you have to provide a Comparator or have the elements implement Comparable (the compareTo method)
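
A short sketch of a TreeSet with an explicit Comparator. SortedSetSketch and the choice of ordering (by length, then alphabetically) are just for illustration. Note that the comparator also decides what counts as a duplicate.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class SortedSetSketch {
    // Returns the distinct words ordered by length, then alphabetically.
    static List<String> byLength(String... words) {
        Comparator<String> order =
                Comparator.comparingInt(String::length)
                          .thenComparing(Comparator.naturalOrder());
        TreeSet<String> sorted = new TreeSet<>(order);
        for (String word : words) {
            sorted.add(word); // duplicates (compare == 0) are ignored
        }
        // Iteration over a TreeSet visits the elements in comparator order.
        return new ArrayList<>(sorted);
    }
}
```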

Queue Interface

* A Queue lets you efficiently add at the tail and remove from the head

* A Deque can add/remove at both ends

* A PriorityQueue isn't a queue in the first-in, first-out sense

- doesn't remember the order in which elements were added

- when an element is removed, the one with the highest priority is removed

- useful for work scheduling
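
A tiny sketch of that behavior, with a hypothetical PriorityQueueSketch class. In java.util.PriorityQueue the "highest priority" element is the smallest one according to the natural ordering or the supplied Comparator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class PriorityQueueSketch {
    // Adds the values in the given order and returns them in removal order.
    static List<Integer> drain(int... priorities) {
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        for (int p : priorities) {
            queue.add(p);
        }
        List<Integer> removalOrder = new ArrayList<>();
        while (!queue.isEmpty()) {
            removalOrder.add(queue.remove()); // always the smallest remaining value
        }
        return removalOrder;
    }
}
```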

Concurrent Modification

Suppose one iterator traverses a collection while another modifies the collection by adding/removing an element. In the case of a linked list, that won't work: the links will no longer be consistent. The linked list detects the concurrent modification and throws ConcurrentModificationException.

To find out which collections keep a modification count, check the Java API documentation. This behavior is also called fail-fast.
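
A minimal sketch of the fail-fast behavior, which does not even need a second thread: modifying the list directly while a for-each loop is iterating over it is enough. FailFastSketch is a made-up class name.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class FailFastSketch {
    // Modifies the list behind the iterator's back; returns true if that was detected.
    static boolean removeDuringForEach() {
        List<String> names = new LinkedList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String name : names) {
                if (name.equals("a")) {
                    names.remove(name); // changes the modification count
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // The safe variant: remove through the iterator that is doing the traversal.
    static List<String> removeViaIterator() {
        List<String> names = new LinkedList<>(Arrays.asList("a", "b", "c"));
        Iterator<String> it = names.iterator();
        while (it.hasNext()) {
            if (it.next().equals("a")) {
                it.remove();
            }
        }
        return names;
    }
}
```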

Fail Fast vs Fail Safe

Reference - 1

Reference - 2

Maps

* HashMap hashes the keys, TreeMap organizes them in sorted order

* map.get(id) returns null if the key does not exist, so you need to check the value. To avoid the check, use map.getOrDefault(id, $value), which returns $value if the key is absent

* Easiest (and efficient) way to iterate over a map : map.forEach( (k,v) -> doSomething )

* Updating map entries

map.put(word, map.get(word) + 1) - throws NullPointerException if the key is not present
If the key may not be present, use map.put(word, map.getOrDefault(word, 0) + 1)
or map.putIfAbsent(word, 0) and then map.put(word, map.get(word) + 1)
or map.merge(word, 1, Integer::sum) - if word wasn't present, put 1; otherwise, put the sum of 1 and the previous value

* LinkedHashMap traverses the entries in the order in which they were added
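
The merge idiom in a small word-count sketch. WordCountSketch is a hypothetical name; a LinkedHashMap is used so the result also shows insertion-order traversal.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WordCountSketch {
    static Map<String, Integer> count(String... words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            // Absent key: store 1. Present key: store old value + 1.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }
}
```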

Views

* A view implements a collection interface without storing the elements. Examples :

Collection<String> greetings = Collections.nCopies(100, "Hello"); // creates the illusion of 100 hellos

Collection<String> greetings = Collections.singleton("Hello");

Collection<String> greetings = Collections.emptySet();

List<Employee> list = staff.subList(10, 20);

Restricted Views

Collections.unmodifiableCollection

Collections.unmodifiableList

Collections.unmodifiableSet

Collections.unmodifiableSortedSet

Collections.unmodifiableNavigableSet

Collections.unmodifiableMap

* look but don't touch

* Synchronized views (Collections.synchronizedList and friends) give safe concurrent access, but you should prefer a thread-safe collection from java.util.concurrent instead.
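
A short sketch of what "look but don't touch" means in practice, with a made-up ViewSketch class: writes through the view fail, while changes to the backing list still show up in the view.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ViewSketch {
    // Returns true if writing through the unmodifiable view is rejected.
    static boolean viewRejectsWrites() {
        List<String> backing = new ArrayList<>(Arrays.asList("a"));
        List<String> readOnly = Collections.unmodifiableList(backing);
        try {
            readOnly.add("b");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // The view stores nothing itself, so it reflects later changes to the backing list.
    static int viewSizeAfterBackingChange() {
        List<String> backing = new ArrayList<>(Arrays.asList("a"));
        List<String> readOnly = Collections.unmodifiableList(backing);
        backing.add("b");
        return readOnly.size();
    }
}
```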

Practical

List<String> names = Arrays.asList("A", "B", "C");

Proposed for Java 7 (collection literals, which never became part of the language)

List<Integer> digits = [1,2,3,4,5,6];

Set<Integer> digits = {1,2,3,4,5,6};

In Java 9

List<Integer> digits = List.of(1,2,3,4,5,6);

Set<Integer> digits = Set.of(1,2,3,4,5,6);

Proposed map literal (also never shipped)

Map<Integer, String> map = {4 : "ab", 5 : "bc", 6 : "ce"};

In Java 9

Version 1 :

Map<Integer, String> map = Map.of(4, "ab", 5, "bc", 6, "ce");

Version 2 :

import static java.util.Map.*;

map = ofEntries( entry(4,"a"), entry(5,"b"), entry(6,"d") );

* Version 1 works only if you have at most 10 entries

Collection to Arrays
String[] names = collection.toArray( new String[collection.size()]);

References

Image : https://facingissuesonit.com/2019/10/15/java-collection-framework-hierarchy/

Book : Core Java 11 Fundamentals, Second Edition by Cay S. Horstmann

Sunday, May 24, 2020

Unit vs Integration Testing

* Developers should run unit tests and then commit the code. (best practice)

* One of the golden rules of unit testing is that your tests should cover code with “business logic”.

In this case, the highlighted part in gold is where you should focus your testing efforts. This is the part of the code where usually most bugs manifest. It is also the part that changes a lot as user requirements change since it is specific to your application.

* So what happens if you get across a legacy application with no unit tests? What if the “business logic” part ends up being thousands of lines of code? Where do you start?
In this case you should prioritize things a bit and just write tests for the following:
1. Core code that is accessed by a lot of other modules
2. Code that seems to gather a lot of bugs
3. Code that is changed by multiple different developers (often to accommodate new requirements)
How much of the code in these areas should we test, you might ask. Well, now that we know which areas to focus on, we can now start to analyze just how much testing we need to feel confident about our code.

Reference :
https://zeroturnaround.com/author/kostis-kapelonis/

Run MySQL as docker container

1. docker pull mysql

2. docker run -p 3306:3306 --name mysqlimage -e MYSQL_ROOT_PASSWORD=abc123 -d mysql

To connect from MySQL Workbench, use one of the hosts below
- localhost
- the container IP, from docker inspect CONTAINER_ID | grep "IPAddress"

Code review

Best Article : https://www.processimpact.com/articles/humanizing_reviews.pdf

Detailed : https://medium.com/palantir/code-review-best-practices-19e02780015f

Stats : https://blog.codinghorror.com/code-reviews-just-do-it/?source=post_page

Book : https://www.amazon.com/exec/obidos/ASIN/0201734850/codihorr-20

Best Practices : https://github.com/palantir/gradle-baseline/tree/develop/docs
