DZone

Apache Parquet is a compressed, efficient columnar data format. The existing Parquet Java libraries were developed for and within the Hadoop ecosystem. Hence there tends to be a near-automatic assumption that one is working with the Hadoop Distributed File System (HDFS).

There are situations in which one might want to write Parquet-formatted data to a file on a regular file system, particularly when working in a context that does not assume Hadoop and HDFS are present. Some big data tools and runtime stacks that do not assume Hadoop can work directly with Parquet files.

Source: DZone