Spliterators in Java 8

Nishirika | March 8, 2018 | java8 | No Comments

JDK 8 brought many changes and additions. One of them is the Spliterator interface, which can split its source for parallel traversal, something the Iterator and ListIterator interfaces do not offer. It can be used to traverse the elements of a Collection, an array, an I/O channel, or a generator function.

Before moving on to Spliterators, let’s understand some of the key concepts that have laid the foundation for Spliterators. These are the areas where Spliterators are generally required.

Parallel Computing/Programming

With multi-core systems now commonplace, parallel programming has become a practical necessity. Developers need to make full use of such hardware to cut processing time and improve performance. Parallel computing refers to executing several tasks at the same time on parallel threads.

A typical example is to take a task, break it down into several smaller tasks, execute them on independent processing units (processors or cores), and then combine their results into the final result. This can significantly reduce the time taken to complete the task.

Note – It is important to analyze whether parallel programming is actually needed, because dividing and coordinating the tasks adds overhead that can outweigh the benefit.

To support this style of programming, Java introduced the Fork/Join framework in JDK 7. JDK 8 took this a step further with Spliterators, which we shall see in detail.
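The divide-and-combine idea described above can be sketched with the Fork/Join framework's RecursiveTask. This is only an illustrative sketch: the class name SumTask, the helper parallelSum and the THRESHOLD value are assumptions made for this example and are not part of the original tutorial.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example class: sums an int[] by recursively splitting it in half.
class SumTask extends RecursiveTask<Long> {
    // Assumed cut-off below which splitting is not worth the overhead.
    private static final int THRESHOLD = 1_000;

    private final int[] data;
    private final int from;
    private final int to;

    SumTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        // Small enough: do the work directly in this thread.
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Otherwise break the task into two smaller tasks.
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                      // schedule the left half asynchronously
        long rightSum = right.compute();  // compute the right half in this thread
        return left.join() + rightSum;    // combine the partial results
    }

    // Convenience entry point that runs the task on the common pool.
    static long parallelSum(int[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data));
    }
}
```

For an array this small the coordination overhead probably outweighs any gain, which is exactly the trade-off mentioned in the note above.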

Collection Framework and Parallel Computing

The java.util package, along with utilities for dates, times, String tokenization and so on, provides a powerful subsystem known as the Collection Framework. Its classes and interfaces offer a standard way of storing and managing groups of objects.

On top of this storage, the Collection Framework provides the Iterator interface, which offers a standardized way of accessing the elements of a Collection one at a time.

To add more to it and to integrate the idea of parallel iteration for Collection objects, JDK 8 defines the Spliterator interface in the java.util package. In the following sections, we shall see how a Spliterator is better than an iterator for iterating through Collection objects.

The difficulty programmers face when applying parallel computing to the Collection Framework is that most Collection implementations are not thread-safe, so accessing one from multiple threads can lead to problems such as memory inconsistency and thread interference.

To get around this issue, the Collection Framework provides synchronization wrappers that add automatic synchronization to a Collection, making it thread-safe. However, this introduces thread contention. To enable parallel computing with non-thread-safe collections, we can instead use parallel streams and aggregate operations, on the condition that we do not modify the contents of the Collection while the operation runs.
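A minimal sketch of the two approaches follows. The class ParallelReadDemo and its helper methods are hypothetical names chosen for this illustration; Collections.synchronizedList and Collection.parallelStream are the standard JDK calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class contrasting a synchronization wrapper with a parallel stream.
class ParallelReadDemo {
    // Builds a plain, non-thread-safe ArrayList holding 1..n.
    static List<Integer> numbers(int n) {
        List<Integer> list = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            list.add(i);
        }
        return list;
    }

    // Aggregate operation on a parallel stream; the list is only read, never modified.
    static int parallelSum(List<Integer> list) {
        return list.parallelStream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> plain = numbers(100);

        // Synchronization wrapper: each call locks the list, so threads contend for that lock.
        List<Integer> synced = Collections.synchronizedList(new ArrayList<>(plain));
        synced.add(101);
        System.out.println(synced.size());

        // Parallel stream over the unsynchronized list, safe because nothing modifies it meanwhile.
        System.out.println(parallelSum(plain));
    }
}
```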

Streams and Parallel Computing

We already discussed that parallel computing can add overhead. With the Stream library, however, running certain operations in parallel becomes much easier and more reliable. When a stream is run in parallel, the Java runtime partitions it into several substreams. These substreams are processed on different cores, and their partial results are then combined to produce the final result.

Although a Stream is not a data storage object, it can still use a Spliterator to access its elements in the same way as a Collection. Using Spliterators provides significant advantages when it comes to working with parallel streams.
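The relationship between streams and Spliterators works in both directions, which the following sketch tries to show. StreamSupport.stream and Stream.spliterator are standard JDK methods; the class StreamSpliteratorDemo and its two helpers are names assumed for this example only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical demo class: Spliterator -> Stream and Stream -> Spliterator.
class StreamSpliteratorDemo {
    // Wraps a Spliterator in a Stream; 'parallel' selects a parallel or a sequential stream.
    static List<String> upperCase(Spliterator<String> source, boolean parallel) {
        return StreamSupport.stream(source, parallel)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    // Asks a Stream for its Spliterator and counts the elements by traversing it.
    static long countViaSpliterator(Stream<String> stream) {
        Spliterator<String> sp = stream.spliterator();
        long[] count = {0};
        sp.forEachRemaining(s -> count[0]++);
        return count[0];
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apple", "Mango", "Banana");
        System.out.println(upperCase(fruits.spliterator(), true));
        System.out.println(countViaSpliterator(fruits.stream()));
    }
}
```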

Spliterator

Spliterator is a combination of the words split and iterator. It can easily split the data and process it or traverse through it. Spliterator is a generic interface that is declared as:

interface Spliterator<T>

Where T is the type of elements being iterated.

Package: java.util

Some Features of a Spliterator

It provides parallel iteration over portions of a sequence (using trySplit()).
It supports parallel programming (using the Fork/Join framework in the Concurrency Utilities).
It is more streamlined, as it combines the hasNext and the next operations into a single method (tryAdvance()).
It works with both Collection API and Stream API classes. Map does not declare a spliterator() method itself, but its keySet(), values() and entrySet() views do.
It can traverse elements individually (using tryAdvance()) and also in bulk (using forEachRemaining()).
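The traversal features listed above can be sketched as follows. The class TraversalDemo and its method names are hypothetical and exist only for this illustration; the Iterator and Spliterator calls are the standard ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;

// Hypothetical demo class comparing Iterator traversal with Spliterator traversal.
class TraversalDemo {
    // Classic Iterator loop: two calls (hasNext, then next) per element.
    static List<String> withIterator(List<String> source) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = source.iterator();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    // Spliterator loop: tryAdvance both checks for and consumes the next element.
    static List<String> withTryAdvance(List<String> source) {
        List<String> out = new ArrayList<>();
        Spliterator<String> sp = source.spliterator();
        while (sp.tryAdvance(out::add)) {
            // Nothing to do here; the action has already run for this element.
        }
        return out;
    }

    // Handles the first element individually, then the remaining ones in bulk.
    static List<String> firstThenBulk(List<String> source) {
        List<String> out = new ArrayList<>();
        Spliterator<String> sp = source.spliterator();
        sp.tryAdvance(s -> out.add("first: " + s));
        sp.forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apple", "Mango", "Banana");
        System.out.println(withIterator(fruits));
        System.out.println(withTryAdvance(fruits));
        System.out.println(firstThenBulk(fruits));
    }
}
```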

Differences between Spliterator and an Iterator

Iterator is used mainly with the Collection API, whereas Spliterator serves both the Collection API and the Stream API classes.
Spliterator was introduced in JDK 8, whereas Iterator was introduced in JDK 1.2.
Spliterator supports both parallel and sequential processing of data whereas Iterator supports only sequential processing of data.
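Iterator is often described as a universal iterator because it applies to every Collection class; Spliterator is not usually described this way, although since JDK 8 every Collection also offers a spliterator() method.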

Methods declared by the Spliterator Interface

long estimateSize() –
Returns an estimate of the number of elements that are yet to be iterated. Returns Long.MAX_VALUE if the number is infinite, unknown, or too expensive to compute.
int characteristics() –
Returns the characteristics of the invoking Spliterator, encoded into an integer. Each Spliterator has a set of attributes that are defined by the static int fields like SORTED, DISTINCT, SIZED, IMMUTABLE, CONCURRENT, NONNULL, SUBSIZED and ORDERED.
default boolean hasCharacteristics(int val) –
Returns true if the invoking Spliterator has the characteristics specified in val; otherwise false.
default void forEachRemaining(Consumer<? super T> action) –
Applies action to each unprocessed element in the data source.
boolean tryAdvance(Consumer<? super T> action) –
If a remaining element exists, applies action to it and returns true; otherwise returns false.
Spliterator<T> trySplit() –
If possible, splits the invoking Spliterator and returns a reference to a new Spliterator for the partition. The new Spliterator iterates over one portion of the data and the original Spliterator iterates over the rest. It returns null if the data cannot be split.
default Comparator<? super T> getComparator() –
Returns the comparator used by the invoking Spliterator, or null if the source is sorted in natural order. Throws IllegalStateException if the Spliterator is not SORTED.
default long getExactSizeIfKnown() –
Returns the number of elements left to iterate if the Spliterator is SIZED; otherwise returns -1.

NOTE: Consumer is a generic functional interface declared in the java.util.function package that applies an action to an object. The easiest way to implement Consumer is with a lambda expression or a method reference.
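A short sketch of how the query methods above might be exercised on an ArrayList and a TreeSet. The class SpliteratorInfoDemo and the helper comparatorUnavailable are names assumed for this example; all Spliterator calls are from the standard interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.TreeSet;

// Hypothetical demo class that inspects Spliterators through their query methods.
class SpliteratorInfoDemo {
    // Returns true when getComparator() throws because the source is not SORTED.
    static boolean comparatorUnavailable(Spliterator<?> sp) {
        try {
            sp.getComparator();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> fruits = new ArrayList<>(
                Arrays.asList("Apple", "Mango", "Banana", "Pear", "Grapes"));
        Spliterator<String> sp = fruits.spliterator();

        System.out.println("estimateSize: " + sp.estimateSize());
        System.out.println("exact size: " + sp.getExactSizeIfKnown());
        System.out.println("SIZED: " + sp.hasCharacteristics(Spliterator.SIZED));
        System.out.println("SORTED: " + sp.hasCharacteristics(Spliterator.SORTED));
        System.out.println("comparator unavailable: " + comparatorUnavailable(sp));

        // Consume one element with an empty lambda as the Consumer, then ask again.
        sp.tryAdvance(s -> { });
        System.out.println("estimateSize after one element: " + sp.estimateSize());

        // A TreeSet is sorted, so its Spliterator reports SORTED and exposes a comparator.
        Spliterator<String> sorted = new TreeSet<>(fruits).spliterator();
        System.out.println("TreeSet SORTED: " + sorted.hasCharacteristics(Spliterator.SORTED));
        System.out.println("TreeSet comparator (null means natural order): " + sorted.getComparator());
    }
}
```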

Nested SubInterfaces of Spliterator

Spliterator has specialized subinterfaces for the primitive types long, int and double. These are Spliterator.OfLong, Spliterator.OfInt, and Spliterator.OfDouble. It also has a generalized version for the primitive data types, Spliterator.OfPrimitive, which acts as a super interface for the previously mentioned primitive subinterfaces and provides additional flexibility.
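As a rough illustration of the primitive variants, the sketch below sums an int array through Spliterator.OfInt, which Arrays.spliterator(int[]) returns. The class PrimitiveSpliteratorDemo and the method sumInTwoHalves are hypothetical names used only here.

```java
import java.util.Arrays;
import java.util.Spliterator;
import java.util.function.IntConsumer;

// Hypothetical demo class using the int specialization of Spliterator.
class PrimitiveSpliteratorDemo {
    // Splits the array once and sums both parts, using IntConsumer to avoid boxing.
    static long sumInTwoHalves(int[] values) {
        Spliterator.OfInt rest = Arrays.spliterator(values);
        Spliterator.OfInt firstPart = rest.trySplit(); // may be null if the source cannot be split

        long[] total = {0};
        // Typing the action as IntConsumer selects the primitive overload of forEachRemaining.
        IntConsumer add = v -> total[0] += v;

        if (firstPart != null) {
            firstPart.forEachRemaining(add);
        }
        rest.forEachRemaining(add);
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(sumInTwoHalves(new int[] {1, 2, 3, 4, 5}));
    }
}
```

Both parts are processed on the same thread here to keep the sketch short; in a real parallel setting each part would typically be handed to a different task.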

Program to demonstrate the use of Spliterator in the Collection API as well as the Stream API

The code for this example is available at https://github.com/HiteshGarg/codingeek/blob/master//Java8/Spliterators/SpliteratorExample.java
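Since the original listing is not reproduced here, the following is only a reconstruction sketch, not the author's program. It assumes an ArrayList holding the five fruit names from the output below and splits its Spliterator once with trySplit(); the class name SpliteratorSketch and the helpers drain and split are hypothetical. It covers the Collection API part only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

// Hypothetical reconstruction: splits a list's Spliterator once and traverses both parts.
class SpliteratorSketch {
    // Collects every remaining element of a Spliterator into a list.
    static List<String> drain(Spliterator<String> sp) {
        List<String> out = new ArrayList<>();
        sp.forEachRemaining(out::add);
        return out;
    }

    // Returns [elements of the split-off Spliterator, elements left in the original one].
    static List<List<String>> split(List<String> source) {
        Spliterator<String> spliterator1 = source.spliterator();
        Spliterator<String> spliterator2 = spliterator1.trySplit();
        List<List<String>> parts = new ArrayList<>();
        parts.add(spliterator2 == null ? new ArrayList<>() : drain(spliterator2));
        parts.add(drain(spliterator1));
        return parts;
    }

    public static void main(String[] args) {
        List<String> fruits = new ArrayList<>(
                Arrays.asList("Apple", "Mango", "Banana", "Pear", "Grapes"));
        List<List<String>> parts = split(fruits);

        System.out.println("Spliterator 2 result: ");
        parts.get(0).forEach(System.out::println);
        System.out.println("Spliterator 1 result: ");
        parts.get(1).forEach(System.out::println);
    }
}
```

With the JDK's ArrayList, trySplit() hands the first half of the range to the new Spliterator, so the split-off Spliterator covers the first two elements and the original keeps the remaining three.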

Output:-
Spliterator 2 result: 
Apple
Mango
Spliterator 1 result: 
Banana
Pear
Grapes

Learning is the whetstone for great minds. So, do come back for more. Hope this helps and you like the tutorial. Do ask for any queries in the comment box and provide your valuable feedback. Share and subscribe.
Keep Coding!! Happy Coding!!

Codingeek