Parquet Wins

Hadoop supports multiple storage formats, such as CSV, Avro (binary), and Parquet. If performance is the only criterion, Parquet wins in most scenarios. See the benchmark blog post below.

http://blog.cloudera.com/blog/2016/04/benchmarking-apache-parquet-the-allstate-experience/

There is another format similar to Parquet called ORC, which is promoted by Hortonworks. Parquet was created jointly by Twitter and Cloudera, and is promoted by Cloudera.
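
Both Parquet and ORC are columnar formats, and that layout is a large part of why they tend to do well on analytical queries. The sketch below is a toy illustration in plain Python, not how either format is actually implemented; the records, field names, and the `to_columns` helper are made up for the example. It only shows the idea that an aggregate over one field needs to touch just that field's values when the data is stored by column.

```python
# Hypothetical toy data: three records with three fields each.
rows = [
    {"policy_id": 1, "state": "IL", "premium": 120.0},
    {"policy_id": 2, "state": "TX", "premium": 95.5},
    {"policy_id": 3, "state": "IL", "premium": 210.25},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout (CSV-like): each whole record is visited to reach one field.
total_from_rows = sum(record["premium"] for record in rows)

# Column layout (Parquet/ORC-like): only the 'premium' values are read.
total_from_columns = sum(columns["premium"])

print(total_from_rows, total_from_columns)
```

Real columnar files add encoding, compression, and per-column statistics on top of this, so actual gains depend on the data and the query; the linked benchmark is the better guide to real numbers.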
