Apache Spark is a major innovation in data science and big data. It was first developed at the University of California, Berkeley, and later donated to the Apache Software Foundation, which has maintained it since. For some workloads, particularly in-memory ones, Spark can run up to 100x faster than comparable big-data technologies such as Hadoop MapReduce.

Apache Spark was not designed just for data engineers; the majority of people who use it are developers. But there is a problem: if you search the internet, most of the resources are based on Scala, so you may think that the Spark APIs are designed just for Scala. In fact, Spark has great APIs and integration for Java, which make Java a stronger choice for working with big data. In this article, I will try to explain a little bit about Spark and then dive into the usage of Apache Spark in Java.

Source: DZone