Get Stratosphere up and running in a few simple steps.
Stratosphere runs on all UNIX-like environments: Linux, Mac OS X, Cygwin. The only requirement is to have a working Java 6.x (or higher) installation.
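To confirm the Java requirement before downloading anything, a small check along the following lines can help. This is only a sketch: the helper name java_major is made up for illustration, and it assumes that `java -version` prints a quoted version string such as "1.6.0_45" or "11.0.2" to standard error, which is the common behaviour but may differ between vendors.

```shell
#!/bin/sh
# Hypothetical helper: extract the major Java version from a version
# string such as "1.6.0_45" (Java 6) or "11.0.2" (Java 11).
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;                      # legacy "1.x" numbering
        *)   echo "$1" | cut -d. -f1 | sed 's/[^0-9].*//' ;; # modern numbering
    esac
}

# `java -version` writes to stderr, so redirect it before parsing.
if command -v java >/dev/null 2>&1; then
    version=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    major=$(java_major "$version")
    if [ "${major:-0}" -ge 6 ]; then
        echo "Java $version found: OK"
    else
        echo "Java $version is too old; Java 6 or higher is required"
    fi
else
    echo "No java executable found on PATH"
fi
```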
Download the ready-to-run binary package. Choose the Stratosphere distribution that matches your Hadoop version. If you are unsure which version to choose, or you just want to run locally, pick the package for Hadoop 1.2.
You are almost done.
$ cd ~/Downloads # Go to download directory
$ tar xzf stratosphere-*.tgz # Unpack the downloaded archive
$ cd stratosphere
$ bin/start-local.sh # Start Stratosphere
Check the JobManager's web frontend at http://localhost:8081 and make sure everything is up and running.
Run the Word Count example to see Stratosphere at work.
$ wget -O hamlet.txt http://www.gutenberg.org/cache/epub/1787/pg1787.txt
Start the example program:
$ bin/stratosphere run \
--jarfile ./examples/stratosphere-java-examples-0.5.1-WordCount.jar \
--arguments file://`pwd`/hamlet.txt file://`pwd`/wordcount-result.txt
You will find a file called wordcount-result.txt in your current directory.
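To take a quick look at the result, you could list the most frequent words. The sketch below assumes that each line of wordcount-result.txt has the form "<word> <count>" separated by whitespace; if your output looks different, adjust the sort keys. The function name top_words is hypothetical and not part of Stratosphere.

```shell
#!/bin/sh
# Hypothetical helper: print the N most frequent words (default 10)
# from a file whose lines are assumed to look like "<word> <count>".
top_words() {
    # Sort by count (field 2) numerically, highest first; break ties by word.
    sort -k2,2nr -k1,1 "$1" | head -n "${2:-10}"
}

# Only run against the real result if it exists in the current directory.
if [ -f wordcount-result.txt ]; then
    top_words wordcount-result.txt 10
fi
```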
Running Stratosphere on a cluster is as easy as running it locally. Having passwordless SSH and the same directory structure on all your cluster nodes lets you use our scripts to control everything.
Set the jobmanager.rpc.address key in conf/stratosphere-conf.yaml to the IP or hostname of your master node. Make sure that all nodes in your cluster have the same jobmanager.rpc.address configured.
List the IPs or hostnames of all worker nodes in conf/slaves.
You can now start the cluster at your master node with bin/start-cluster.sh.
The following example illustrates the setup with three nodes (with IP addresses from 10.0.0.1 to 10.0.0.3 and hostnames master, worker1, worker2) and shows the contents of the configuration files, which need to be accessible at the same path on all machines:
/path/to/stratosphere/conf/stratosphere-conf.yaml
jobmanager.rpc.address: 10.0.0.1
/path/to/stratosphere/conf/slaves
10.0.0.2
10.0.0.3
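The two configuration files from this example could also be generated with a small script. The following is a sketch, not an official tool: the function write_cluster_conf is hypothetical, it assumes conf/slaves takes one worker per line, and it moves the jobmanager.rpc.address entry to the end of stratosphere-conf.yaml instead of editing it in place. The demo writes into a temporary directory so that nothing in a real installation is touched.

```shell
#!/bin/sh
# Hypothetical helper.
# Usage: write_cluster_conf CONF_DIR MASTER WORKER...
write_cluster_conf() {
    conf_dir=$1
    master=$2
    shift 2
    conf_file="$conf_dir/stratosphere-conf.yaml"

    touch "$conf_file"
    # Drop any existing jobmanager.rpc.address line, then append the new one.
    # grep exits non-zero when nothing is left over, hence the "|| true".
    grep -v '^jobmanager\.rpc\.address:' "$conf_file" > "$conf_file.tmp" || true
    echo "jobmanager.rpc.address: $master" >> "$conf_file.tmp"
    mv "$conf_file.tmp" "$conf_file"

    # Write the remaining arguments as workers, one per line.
    printf '%s\n' "$@" > "$conf_dir/slaves"
}

# Demo with the three-node setup from the example, in a scratch directory.
demo_dir=$(mktemp -d)
write_cluster_conf "$demo_dir" 10.0.0.1 10.0.0.2 10.0.0.3
cat "$demo_dir/stratosphere-conf.yaml" "$demo_dir/slaves"
```

For a real cluster you would point the first argument at /path/to/stratosphere/conf and copy the resulting files to the same path on every node.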
You can easily deploy Stratosphere on your existing YARN cluster.
Start the YARN client with ./bin/yarn-session.sh. You can run the client with the options -n 10 -tm 8192 to allocate 10 TaskManagers with 8 GB of memory each.
For more detailed instructions, check out the programming guides and examples.
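When sizing a YARN session, it can help to work out how much memory the options request in total. The sketch below does that arithmetic for the -n and -tm values mentioned above, treating -tm as megabytes per TaskManager (8192 MB = 8 GB). The function name yarn_session_memory_mb is hypothetical, and the figure covers TaskManager memory only.

```shell
#!/bin/sh
# Hypothetical helper: total TaskManager memory requested by a YARN session.
# $1 = number of TaskManagers (-n), $2 = memory per TaskManager in MB (-tm)
yarn_session_memory_mb() {
    echo $(( $1 * $2 ))
}

total_mb=$(yarn_session_memory_mb 10 8192)
echo "Requested: ${total_mb} MB ($(( total_mb / 1024 )) GB)"

# The corresponding session would be started with:
#   ./bin/yarn-session.sh -n 10 -tm 8192
```

With 10 TaskManagers at 8192 MB each, this comes to 81920 MB, or 80 GB, which your YARN cluster needs to have available.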