Introduction to Mesos: Set up a basic 2-node cluster (Master & Slave)

mswimmer/mesos-tutorial

mesos-tutorial

Introduction to Mesos

Prelude

This tutorial explains how to create a remote CentOS 7 Mesos Master node and a remote CentOS 7 Mesos Slave node, how to configure and test the services (Mesos, ZooKeeper, Marathon), and how to test the cluster.

Create 2 CentOS7 VMs

Cloning VM from CentOS 7 Vagrant template

I used:

  • Mesos Master node: 10.145.6.64 / d1p3920-charles-mesos-master.vchslabs.vmware.com
  • Mesos Slave node: 10.145.6.68 / d1p3920-charles-mesos-slave.vchslabs.vmware.com

To create the two VMs, I cloned an internal CentOS 7 Vagrant template that was already uploaded to vCenter.

Set up network configuration

Check out this gist

Turning the 2 blank VMs into a Mesos Master & Mesos Slave

Check out the official Mesos tutorial and follow the "RedHat 7 / CentOS 7" instructions

Install Mesos Master Node

Install packages

Add yum repo

$ rpm -Uvh http://repos.mesosphere.io/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm

Install ZooKeeper

$ yum -y install mesosphere-zookeeper

Install Mesos & Marathon

$ yum -y install mesos marathon

ZooKeeper

Configure ZooKeeper

Set the ID in /var/lib/zookeeper/myid to a unique integer between 1 and 255 on each node

$ echo "1" > /var/lib/zookeeper/myid

ZooKeeper list of server addresses

$ ifconfig eth0 # get IP address (interface eth0)
$ echo "server.1=MESOS_MASTER_IP:2888:3888" >> /etc/zookeeper/conf/zoo.cfg
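With more than one ZooKeeper node, zoo.cfg needs one server.N line per ensemble member, where N matches that node's myid. A minimal sketch of generating those lines, assuming the ports 2888 and 3888 used above; zk_server_lines is a hypothetical helper, not part of ZooKeeper or Mesos.

```shell
# Hypothetical helper: print one "server.N=IP:2888:3888" line per IP argument.
# N starts at 1 and must match the myid value on the corresponding node.
zk_server_lines() {
    n=1
    for ip in "$@"; do
        printf 'server.%d=%s:2888:3888\n' "$n" "$ip"
        n=$((n + 1))
    done
}

# Single-node case from this tutorial; review the output before appending it:
# zk_server_lines 10.145.6.64 >> /etc/zookeeper/conf/zoo.cfg
zk_server_lines 10.145.6.64
```

Printing the lines first, rather than appending directly, makes it easier to avoid duplicate entries when the command is run twice.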

Start ZooKeeper service

$ systemctl start zookeeper

Test ZooKeeper service

$ ps -aux | grep zookeeper
root 1138 java -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp /opt/mesosphere/zookeeper/bin/...
$ systemctl status zookeeper
zookeeper.service - Apache ZooKeeper
 Loaded: loaded (/usr/lib/systemd/system/zookeeper.service; enabled)
 Active: active (running) since Sun 2015-05-31 18:26:57 PDT; 18min ago
 Main PID: 1138 (java)
 CGroup: /system.slice/zookeeper.service
 └─1138 java -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp /opt/mesosphere/zookeeper/bin/../build/classes:/opt/mesosphere/zookeeper/bin/../build/lib/*.jar:/opt/m...
May 31 18:32:51 d1p3920-charles-mesos-master.vchslabs.vmware.com zookeeper[1138]: at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)

Make sure the ZooKeeper service listens on port 2181

$ ss -ln | grep 2181
tcp LISTEN 0 50 :::2181 :::*
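A plain grep also matches any line that merely contains the digits 2181, for example port 21810. A minimal sketch of a stricter check, assuming the `ss -ln` column layout shown above (state in the second column, local address in the fifth); listens_on is a hypothetical helper.

```shell
# Hypothetical helper: read `ss -ln` output on stdin and succeed if some
# socket in LISTEN state has a local address ending in ":<port>".
listens_on() {
    awk -v port="$1" '
        $2 == "LISTEN" && $5 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage on the master node:
# ss -ln | listens_on 2181 && echo "ZooKeeper is listening on 2181"
```

The exit status makes the check usable in scripts without parsing any text output.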

Mesos

Extra config for Mesos

Mesos Master IP

$ ip addr show eth0 | grep 'inet '
inet 10.145.6.64 brd 10.145.6.255 ...
$ echo "10.145.6.64" > /etc/mesos-master/ip

Mesos Master Hostname

$ nslookup 10.145.6.64
Server:		10.132.71.1
Address:	10.132.71.1#53
64.6.145.10.in-addr.arpa	name = d1p3920-charles-mesos-master.vchslabs.vmware.com.
$ echo "d1p3920-charles-mesos-master.vchslabs.vmware.com" > /etc/mesos-master/hostname
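The name returned by the reverse lookup ends with a dot (the DNS root), which is dropped when writing the hostname file above. A minimal sketch of extracting the name without that dot, assuming the nslookup output format shown above; ptr_name is a hypothetical helper.

```shell
# Hypothetical helper: read nslookup reverse-lookup output on stdin and
# print the name after "name = " without its trailing dot.
ptr_name() {
    sed -n 's/^.*name = \(.*[^.]\)\.*$/\1/p'
}

# Usage on the master node (IP from this tutorial):
# nslookup 10.145.6.64 | ptr_name > /etc/mesos-master/hostname
```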

Cluster name

$ echo "charles-cluster" > /etc/mesos-master/cluster

ZooKeeper URL containing the Master's IP

$ echo "zk://MESOS_MASTER_IP:2181/mesos" > /etc/mesos/zk
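The zk:// URL can list several ZooKeeper hosts, separated by commas, ahead of the /mesos path. A minimal sketch of building it, assuming port 2181 and the /mesos path used in this tutorial; zk_url is a hypothetical helper and the addresses in the second example are placeholders.

```shell
# Hypothetical helper: build a zk:// URL from one or more ZooKeeper hosts.
# Port 2181 and the /mesos path are the values used in this tutorial.
zk_url() {
    hosts=""
    for ip in "$@"; do
        hosts="${hosts:+$hosts,}$ip:2181"
    done
    printf 'zk://%s/mesos\n' "$hosts"
}

zk_url 10.145.6.64   # prints zk://10.145.6.64:2181/mesos
# With several ZooKeeper nodes (placeholder addresses):
# zk_url 10.0.0.1 10.0.0.2 10.0.0.3 > /etc/mesos/zk
```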

Quorum

/etc/mesos-master/quorum should remain 1, because this cluster has a single master
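For clusters with more masters, the Mesos documentation asks for the quorum to be a strict majority of the masters. A minimal sketch of that arithmetic; quorum_for is a hypothetical helper.

```shell
# Hypothetical helper: smallest strict majority for a given number of masters.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

quorum_for 1   # prints 1 (this tutorial)
quorum_for 3   # prints 2
quorum_for 5   # prints 3
```

An odd number of masters is the usual choice, since adding a fourth master raises the quorum to 3 without tolerating more failures than three masters do.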

Disable mesos-slave service

$ systemctl stop mesos-slave
$ systemctl disable mesos-slave
rm '/etc/systemd/system/multi-user.target.wants/mesos-slave.service'

Start Mesos Master

Restart the Mesos Master service

$ systemctl restart mesos-master

Test Mesos Master service

$ ps -aux | grep mesos-master
root 2395 /usr/sbin/mesos-master --zk=zk://10.145.6.64:2181/mesos --port=5050 --log_dir=/var/log/mesos --cluster=charles-cluster --hostname=d1p3920-charles-mesos-master.vchslabs.vmware.com. --ip=10.145.6.64 --quorum=1 --work_dir=/var/lib/mesos
$ systemctl status mesos-master
 Loaded: loaded (/usr/lib/systemd/system/mesos-master.service; enabled)
 Active: active (running) since Sun 2015-05-31 18:32:34 PDT; 24min ago
 Main PID: 2395 (mesos-master)
 CGroup: /system.slice/mesos-master.service
 ├─2395 /usr/sbin/mesos-master --zk=zk://10.145.6.64:2181/mesos --port=5050 --log_dir=/var/log/mesos --cluster=charles-cluster --hostname=d1p3920-charles-mesos-master.vchslabs.vm...
 ├─2411 logger -p user.info -t mesos-master[2395]
 └─2412 logger -p user.err -t mesos-master[2395]
May 31 18:57:10 d1p3920-charles-mesos-master.vchslabs.vmware.com mesos-master[2412]: I0531 18:57:10.816542 2420 master.cpp:2273] Processing ACCEPT call for offers: [ 20150531-183234-10741...

Marathon

Start Marathon service

$ systemctl restart marathon.service

Test Marathon service

$ ps -aux | grep marathon
root java -Djava.library.path=/usr/local/lib:/usr/lib:/usr/lib64 -Djava.util.logging.SimpleFormatter.format=%2$s%5$s%6$s%n -Xmx512m -cp /usr/bin/marathon mesosphere.marathon.Main --zk zk://10.145.6.64:2181/marathon --master zk://10.145.6.64:2181/mesos
$ systemctl status marathon
marathon.service - Marathon
 Loaded: loaded (/usr/lib/systemd/system/marathon.service; enabled)
 Active: active (running) since Sun 2015-05-31 18:26:57 PDT; 32min ago
 Main PID: 1140 (java)
 CGroup: /system.slice/marathon.service
 ├─1140 java -Djava.library.path=/usr/local/lib:/usr/lib:/usr/lib64 -Djava.util.logging.SimpleFormatter.format=%2$s%5$s%6$s%n -Xmx512m -cp /usr/bin/marathon mesosphere.marathon.M...
 ├─1199 logger -p user.info -t marathon[1140]
 └─1200 logger -p user.notice -t marathon[1140]
May 31 18:58:53 d1p3920-charles-mesos-master.vchslabs.vmware.com marathon[1199]: [2015-05-31 18:58:53,360] INFO 10.113.229.247 - - [01/Jun/2015:01:58:53 +0000] "GET /v2/apps//hello-marat...

Test the Mesos Master node

Mesos Master console - port 5050

No Mesos Slave node is registered so far.
[Screenshot: mesos master console]

Mesos Master Marathon console - port 8080

[Screenshot: mesos master marathon console]

Install Mesos Slave node

Install packages

Add yum repo

$ rpm -Uvh http://repos.mesosphere.io/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm

Install Mesos & telnet (for testing ports)

$ yum -y install mesos telnet

Test network connection with Mesos Master node

Ping the VM at its IP address

$ ping 10.145.6.64
PING 10.145.6.64 (10.145.6.64) 56(84) bytes of data.
64 bytes from 10.145.6.64: icmp_seq=1 ttl=64 time=0.979 ms
64 bytes from 10.145.6.64: icmp_seq=2 ttl=64 time=0.456 ms
....

Test if the port 2181, used by ZooKeeper on the Mesos Master node, is open

$ telnet 10.145.6.64 2181 
Trying 10.145.6.64...
Connected to 10.145.6.64.
Escape character is '^]'.
;
Connection closed by foreign host.

Mesos

Extra config for Mesos

Mesos Slave IP

$ ip addr show eth0 | grep 'inet '
inet 10.145.6.68 brd 10.145.6.255 ...
$ echo "10.145.6.68" > /etc/mesos-slave/ip

Mesos Slave Hostname

$ nslookup 10.145.6.68
Server:		10.132.71.1
Address:	10.132.71.1#53
68.6.145.10.in-addr.arpa	name = d1p3920-charles-mesos-slave.vchslabs.vmware.com.
$ echo "d1p3920-charles-mesos-slave.vchslabs.vmware.com" > /etc/mesos-slave/hostname

ZooKeeper URL containing the Master's IP

$ echo "zk://MESOS_MASTER_IP:2181/mesos" > /etc/mesos/zk

Disable mesos-master service

$ systemctl stop mesos-master
$ systemctl disable mesos-master
rm '/etc/systemd/system/multi-user.target.wants/mesos-master.service'

Start Mesos Slave

Restart the Mesos Slave service

$ systemctl restart mesos-slave

Test Mesos Slave service

$ ps -aux | grep mesos-slave
root /usr/sbin/mesos-slave --master=zk://10.145.6.64:2181/mesos --log_dir=/var/log/mesos --hostname=d1p3920-charles-mesos-slave.vchslabs.vmware.com. --ip=10.145.6.68
$ systemctl status mesos-slave.service
mesos-slave.service - Mesos Slave
 Loaded: loaded (/usr/lib/systemd/system/mesos-slave.service; enabled)
 Active: active (running) since Sun 2015-05-31 18:34:06 PDT; 38min ago
 Main PID: 32373 (mesos-slave)
 CGroup: /system.slice/mesos-slave.service
 ├─32373 /usr/sbin/mesos-slave --master=zk://10.145.6.64:2181/mesos --log_dir=/var/log/mesos --hostname=d1p3920-charles-mesos-slave.vchslabs.vmware.com. --ip=10.145.6.68
 ├─32383 logger -p user.info -t mesos-slave[32373]
 └─32384 logger -p user.err -t mesos-slave[32373]
May 31 19:02:07 localhost.localdomain mesos-slave[32384]: I0531 19:02:07.241608 32385 slave.cpp:3648] Current disk usage 6.79%. Max allowed age: 5.824677064796389days

Test the Mesos Slave node

A slave node is registered with the Mesos Master
[Screenshot: mesos slave0]

The first node appears in the list of registered nodes on the master
[Screenshot: mesos slave1]

Here is the summary of the first Mesos Slave node
[Screenshot: mesos slave2]

Test the Cluster

Connection Mesos Slave -> Mesos Master

$ mesos-resolve `cat /etc/mesos/zk`
2015-05-31 19:14:28,941:32455(0x7f0c8a303700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2015-05-31 19:14:28,941:32455(0x7f0c8a303700):ZOO_INFO@log_env@716: Client environment:host.name=localhost.localdomain
2015-05-31 19:14:28,941:32455(0x7f0c8a303700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2015-05-31 19:14:28,941:32455(0x7f0c8a303700):ZOO_INFO@log_env@724: Client environment:os.arch=3.10.0-123.el7.x86_64
2015-05-31 19:14:28,941:32455(0x7f0c8a303700):ZOO_INFO@log_env@725: Client environment:os.version=#1 SMP Mon Jun 30 12:09:22 UTC 2014
2015-05-31 19:14:28,941:32455(0x7f0c8a303700):ZOO_INFO@log_env@733: Client environment:user.name=root
2015-05-31 19:14:28,941:32455(0x7f0c8a303700):ZOO_INFO@log_env@741: Client environment:user.home=/root
2015-05-31 19:14:28,941:32455(0x7f0c8a303700):ZOO_INFO@log_env@753: Client environment:user.dir=/etc/mesos-slave
2015-05-31 19:14:28,941:32455(0x7f0c8a303700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=10.145.6.64:2181 sessionTimeout=10000 watcher=0x7f0c9155f1e0 sessionId=0 sessionPasswd=<null> context=0x7f0c74001160 flags=0
2015-05-31 19:14:28,942:32455(0x7f0c858ee700):ZOO_INFO@check_events@1703: initiated connection to server [10.145.6.64:2181]
2015-05-31 19:14:28,956:32455(0x7f0c858ee700):ZOO_INFO@check_events@1750: session establishment complete on server [10.145.6.64:2181], sessionId=0x14dacbaa0b80013, negotiated timeout=10000
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0531 19:14:28.956903 32463 group.cpp:313] Group process (group(1)@127.0.0.1:49647) connected to ZooKeeper
I0531 19:14:28.956985 32463 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0531 19:14:28.957010 32463 group.cpp:385] Trying to create path '/mesos' in ZooKeeper
I0531 19:14:28.959980 32463 detector.cpp:138] Detected a new leader: (id='9')
I0531 19:14:28.960134 32463 group.cpp:659] Trying to get '/mesos/info_0000000009' in ZooKeeper
I0531 19:14:28.961226 32463 detector.cpp:452] A new leading master (UPID=master@10.145.6.64:5050) is detected
10.145.6.64:5050

Launch task from Mesos Slave node

The best way to test is to launch a task with mesos-execute from the Mesos Slave node

Set the MASTER

$ export MASTER=$(mesos-resolve `cat /etc/mesos/zk`)
$ echo $MASTER
10.145.6.64:5050
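If ZooKeeper is unreachable or no master has been elected, MASTER may end up empty or malformed, and mesos-execute then fails with a less obvious error. A minimal sketch of a sanity check before launching a task, assuming the master is reported as IPv4:port as in the output above; is_host_port is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if the argument looks like IPv4:port.
is_host_port() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}:[0-9]{1,5}$'
}

# Usage on the slave node:
# MASTER=$(mesos-resolve "$(cat /etc/mesos/zk)")
# is_host_port "$MASTER" || echo "could not resolve the leading master" >&2
```

The check is purely syntactic: it does not verify that the octets are in range or that the master is reachable.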

Launch the task

$ mesos-execute --master=$MASTER --name="cluster-test" --command="sleep 5"
I0531 19:15:50.273203 32492 sched.cpp:157] Version: 0.22.1
I0531 19:15:50.277058 32497 sched.cpp:254] New master detected at master@10.145.6.64:5050
I0531 19:15:50.277282 32497 sched.cpp:264] No credentials provided. Attempting to register without authentication
I0531 19:15:50.279747 32497 sched.cpp:448] Framework registered with 20150531-183234-1074172170-5050-2395-0003
Framework registered with 20150531-183234-1074172170-5050-2395-0003
task cluster-test submitted to slave 20150531-183234-1074172170-5050-2395-S0
Received status update TASK_RUNNING for task cluster-test
Received status update TASK_FINISHED for task cluster-test
I0531 19:15:55.405704 32496 sched.cpp:1589] Asked to stop the driver
I0531 19:15:55.405743 32496 sched.cpp:831] Stopping framework '20150531-183234-1074172170-5050-2395-0003'
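To use this test in a script, the status updates can be checked automatically instead of by eye. A minimal sketch, assuming the "Received status update ..." line format shown above; task_finished is a hypothetical helper, not a Mesos command.

```shell
# Hypothetical helper: read mesos-execute output on stdin and succeed only if
# the last "Received status update" line reports TASK_FINISHED.
task_finished() {
    awk '
        /Received status update/ { last = $4 }
        END { exit !(last == "TASK_FINISHED") }
    '
}

# Usage on the slave node (2>&1 so both output streams are inspected):
# mesos-execute --master=$MASTER --name="cluster-test" --command="sleep 5" 2>&1 \
#     | task_finished && echo "cluster test passed"
```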

Result in the Mesos Master console

Under Slaves / Completed Frameworks, see the Completed Executors list
[Screenshot: executed list]

Executor details showing tasks
[Screenshot: tasks]

Check Mesos service logs

$ journalctl -u mesos-slave

Recurrent Problems

LIBPROCESS_IP not defined for the Mesos Slave node

Problem

$ mesos-execute --master=$MASTER --name="cluster-test" --command="sleep 5"
**************************************************
Scheduler driver bound to loopback interface! Cannot communicate with remote master(s). You might want to set 'LIBPROCESS_IP' environment variable to use a routable IP address.
**************************************************

Solution: set LIBPROCESS_IP as an env variable

$ export LIBPROCESS_IP=10.145.6.68
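Rather than typing the address by hand, the value can be read from the file written during the slave configuration. A minimal sketch, assuming /etc/mesos-slave/ip holds the routable address as set earlier; routable_ip_from is a hypothetical helper.

```shell
# Hypothetical helper: print the IP stored in the given file, refusing an
# empty value or a loopback address, which would reproduce the error above.
routable_ip_from() {
    ip=$(tr -d '[:space:]' < "$1") || return 1
    case "$ip" in
        ""|127.*) return 1 ;;
        *) printf '%s\n' "$ip" ;;
    esac
}

# Usage on the slave node:
# LIBPROCESS_IP=$(routable_ip_from /etc/mesos-slave/ip) && export LIBPROCESS_IP
```

The export only applies to the current shell session; add it to a shell profile to keep it across logins.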

Issues coming from previous mesos runs

Clear the cache saved from prior run

$ systemctl stop mesos-slave
$ rm -f /tmp/mesos/meta/slaves/latest
$ systemctl start mesos-slave
$ systemctl status mesos-slave

iptables

The iptables rules on a CentOS 7 VM should look like this (policy ACCEPT on every chain and no rules)

$ iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
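The same comparison can be scripted. A minimal sketch, assuming the `iptables -L` output format shown above; iptables_is_open is a hypothetical helper that treats any listed rule, any non-ACCEPT policy, or any user-defined chain as a mismatch.

```shell
# Hypothetical helper: read `iptables -L` output on stdin and succeed only if
# every chain has policy ACCEPT and no rules are listed.
iptables_is_open() {
    awk '
        /^Chain / { chains++; if ($0 !~ /\(policy ACCEPT/) bad = 1; next }
        /^target / || /^[ \t]*$/ { next }   # column header or blank line
        { bad = 1 }                          # anything else is a rule
        END { exit (bad || chains == 0) }
    '
}

# Usage on either node:
# iptables -L | iptables_is_open || echo "iptables differs from the expected empty ruleset" >&2
```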
