Azure HDInsight
You can quickly launch an Azure HDInsight cluster using the following link.
https://azure.microsoft.com/en-in/resources/templates/101-hdinsight-spark-linux/
A few things I learned while using the above link. First, it lands you on a page where you need to enter a cluster name, a cluster admin name and password, and an SSH user name and password. When I entered the required info and clicked "Purchase", it failed. When I looked into the log, I found that the issue was related to the password. The error message coming back is not always correct: it says the password should be between 6-xx characters, but in reality the password needs to be at least 10 characters long, with at least one uppercase letter, one lowercase letter, and one number. I had to spend 15-20 minutes to figure this out and get it working.
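To save yourself the same trial and error, you can check a candidate password up front. The sketch below encodes only the rules I observed (at least 10 characters, one uppercase letter, one lowercase letter, one digit); the actual Azure validation may enforce additional rules.

```python
import re

def is_valid_hdinsight_password(password: str) -> bool:
    """Check a password against the rules observed for HDInsight:
    at least 10 characters, with at least one uppercase letter,
    one lowercase letter, and one digit."""
    return (
        len(password) >= 10
        and re.search(r"[A-Z]", password) is not None
        and re.search(r"[a-z]", password) is not None
        and re.search(r"[0-9]", password) is not None
    )

# "Secret123" is only 9 characters, so it fails the length rule.
print(is_valid_hdinsight_password("Secret123"))    # False
print(is_valid_hdinsight_password("Secret12345"))  # True
```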
Once you launch the cluster, you can access the Ambari UI at <clustername>.azurehdinsight.net. You can access the Spark cluster via Livy endpoints such as <clustername>.azurehdinsight.net/livy/sessions or <clustername>.azurehdinsight.net/livy/batches. For more information, see the following link.
https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-livy-rest-interface
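As a minimal sketch, the Livy endpoints above can be queried over HTTPS with basic authentication using the cluster admin login. The cluster name and credentials below are placeholders, and this is only an illustration of the REST calls, not a full client.

```python
import base64
import json
import urllib.request

def livy_url(cluster_name: str, resource: str) -> str:
    """Build the Livy REST URL for an HDInsight cluster,
    e.g. resource="sessions" or resource="batches"."""
    return "https://{}.azurehdinsight.net/livy/{}".format(cluster_name, resource)

def list_livy_resource(cluster_name, resource, admin_user, admin_password):
    """GET /livy/sessions or /livy/batches with HTTP basic auth
    (the cluster admin login, not the SSH user)."""
    credentials = base64.b64encode(
        "{}:{}".format(admin_user, admin_password).encode()
    ).decode()
    request = urllib.request.Request(
        livy_url(cluster_name, resource),
        headers={"Authorization": "Basic " + credentials},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# Example call with placeholder values -- replace with your own cluster details:
# list_livy_resource("mycluster", "sessions", "admin", "MyPassword123")
```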
When you launch a cluster, it also provides SSH access at <sshuser>@<clustername>-ssh.azurehdinsight.net. You can log in there and upload files, Spark code, and so on. You can also use Hadoop commands here. For example, you can upload "demo.jar" and then put it into the following HDFS location.
"wasb://<clustername>@<storageaccountname>.blob.core.windows.net/example/data/codebase/jar/demo.jar"
hadoop fs -put -f demo.jar /example/data/codebase/jar
Now you can access this file in Livy commands using the wasb:// URL shown above.
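For example, the uploaded jar can be run through Livy by POSTing to the /livy/batches endpoint mentioned earlier. The sketch below builds the batch request body around the wasb:// path from above; the main class name, cluster name, and credentials are all hypothetical placeholders, and the X-Requested-By header is one some Livy deployments require on POSTs.

```python
import base64
import json
import urllib.request

def build_batch_payload(jar_path: str, class_name: str, args=None) -> dict:
    """Build the JSON body for POST /livy/batches: the application jar
    (here a wasb:// path in the cluster's storage account), its main
    class, and optional program arguments."""
    payload = {"file": jar_path, "className": class_name}
    if args:
        payload["args"] = list(args)
    return payload

def submit_batch(cluster_name, admin_user, admin_password, payload):
    """POST the batch job to Livy with basic auth; returns the created
    batch description (including its id) as a dict."""
    credentials = base64.b64encode(
        "{}:{}".format(admin_user, admin_password).encode()
    ).decode()
    request = urllib.request.Request(
        "https://{}.azurehdinsight.net/livy/batches".format(cluster_name),
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": "Basic " + credentials,
            "Content-Type": "application/json",
            "X-Requested-By": "admin",  # required by some Livy configurations
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# Placeholder jar path and main class, matching the upload above.
payload = build_batch_payload(
    "wasb://<clustername>@<storageaccountname>.blob.core.windows.net"
    "/example/data/codebase/jar/demo.jar",
    "com.example.DemoApp",  # hypothetical main class
)
print(json.dumps(payload))
```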