Impala

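1. Creating a partitioned table in Parquet format directly from an unpartitioned table in a different format. In a CREATE TABLE AS SELECT, the partition columns are named without types in PARTITIONED BY and must come last in the SELECT list, in the same order.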

CREATE EXTERNAL TABLE dbname.partitioned_table
PARTITIONED BY (col5, col6)
STORED AS PARQUET
LOCATION 'hdfs://nameservice/path/partitioned_table'
AS SELECT
  col1,
  col2,
  col3,
  col4,
  col5,
  col6
FROM dbname.unpartitioned_table;
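
To check the result, something like the following could be run in impala-shell afterwards. This is only a sketch: it assumes the statement above completed and reuses the same table name. Both statements are standard Impala SQL.

```sql
-- List the partitions the CTAS statement created
-- (one row per distinct col5/col6 combination).
SHOW PARTITIONS dbname.partitioned_table;

-- Gather table and column statistics so the Impala planner
-- has up-to-date information about the new table.
COMPUTE STATS dbname.partitioned_table;
```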

2. Error while accessing a Hive TEXTFILE format table in Impala.

Create a TEXTFILE format table in Hive:

CREATE EXTERNAL TABLE test_impala_do (
  category STRING,
  segment STRING,
  level_1 STRING,
  group_code INT,
  d_xuid_count INT,
  segment_code INT,
  datetime STRING COMMENT 'DATETIME',
  source_file STRING COMMENT 'SOURCE_FILE'
)
PARTITIONED BY (
  akey STRING COMMENT 'AKEY AS DEFINED'
)
WITH SERDEPROPERTIES ('serialization.format'='1')
STORED AS TEXTFILE
LOCATION 'hdfs://nameservice1/..../test_impala_do'

The table is populated by a Spark SQL job:

INSERT OVERWRITE TABLE test_impala_do.....

If I execute the following before populating the data:

sqlContext.setConf("hive.exec.compress.output","true")

then accessing the table in Impala throws an exception: it could not load the table metadata, complaining that compressed files must have one of a set of predefined extensions.

If I populate the table without this setting, it works fine.
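
A minimal sketch of that workaround, written as SQL rather than through sqlContext.setConf. It assumes the Spark job can issue SET statements in the same session as the insert, and that Impala needs its metadata reloaded after files are written outside of it. INVALIDATE METADATA is a fairly heavy operation, and a lighter REFRESH may be enough depending on the Impala version.

```sql
-- Spark SQL session, before running the INSERT OVERWRITE job:
-- explicitly turn Hive output compression off so the job writes
-- plain, uncompressed text files.
SET hive.exec.compress.output=false;

-- Impala, after the Spark job has finished:
-- reload the table's metadata so Impala sees the newly written files.
INVALIDATE METADATA test_impala_do;
```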



