Following on from my previous post on Installing the Amazon EMR Command Line Interface for Windows, this post looks at how to create an Elastic MapReduce job flow (a Hive cluster) from the CLI. This covers the creation of instances, i.e. a Master instance and Slave instances, for an interactive Hive session.
Step 1: Create the job flow without specifying the number of instances
- Run the following command from the Amazon EMR CLI directory:
ruby elastic-mapreduce --create --alive --hive-interactive --name "Test_Job Flow" --instance-type m1.small
- The instance should now be up and running
- Because no instance count was specified, this creates just one instance, which acts as the Master instance
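If you create job flows often, the command can be assembled from a script instead of typed by hand. The following is only a minimal sketch in Python: the function name `build_create_args` is hypothetical, it assumes the CLI is invoked as `ruby elastic-mapreduce` from its install directory, and the optional `num_instances` parameter maps to the CLI's `--num-instances` switch. Left unset, it reproduces the single-instance command above.

```python
import subprocess


def build_create_args(name, instance_type="m1.small", num_instances=None):
    """Build the argument list for an interactive Hive job flow.

    Assumption: the EMR CLI is started as `ruby elastic-mapreduce`
    from the directory where it was installed.
    """
    args = [
        "ruby", "elastic-mapreduce",
        "--create", "--alive", "--hive-interactive",
        "--name", name,
        "--instance-type", instance_type,
    ]
    # Only pass --num-instances when a count was actually requested.
    if num_instances is not None:
        args += ["--num-instances", str(num_instances)]
    return args


if __name__ == "__main__":
    # list2cmdline quotes arguments containing spaces for the Windows prompt.
    print(subprocess.list2cmdline(build_create_args("Test_Job Flow")))
```

To actually launch the job flow, the returned list could be passed to `subprocess.run(...)` with the EMR CLI directory as the working directory.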
Step 2: Now add the --num-instances switch and see the result
ruby elastic-mapreduce --create --alive --hive-interactive --name "Test_Job Flow" --instance-type m1.small --num-instances 2
- The number of instances is now 2
- Viewing the Instance Groups shows one instance acting as MASTER and the second as CORE
The same holds in a 3-instance scenario: you will have 1 MASTER, and the remaining 2 instances will be CORE instances.
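The pattern observed in these steps can be written down as a small rule of thumb. The sketch below is illustrative Python, not part of the EMR CLI, and the function name `instance_groups` is hypothetical. It simply encodes what was seen here: one MASTER, with every additional instance becoming CORE.

```python
def instance_groups(num_instances=1):
    """Return the MASTER/CORE split observed for a given instance count."""
    if num_instances < 1:
        raise ValueError("a job flow needs at least one instance")
    # One instance is always the MASTER; the rest are CORE instances.
    return {"MASTER": 1, "CORE": num_instances - 1}


if __name__ == "__main__":
    for count in (1, 2, 3):
        print(count, instance_groups(count))
```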