Install Ansible on CentOS7.2 and thoughts about Continuous Delivery

Install Ansible

Reminder: Ansible is agentless and works over SSH. It is a tool for Continuous Delivery. It connects to hosts with SSH keys, so each host must authorize the key of the machine running Ansible. See here for how to do this: https://java8fx.wordpress.com/2017/07/01/ansible-mysql5-7-for-centos7-2/

Quick install

First install the EPEL (Extra Packages for Enterprise Linux) repository, then Ansible.

sudo yum install epel-release
sudo yum install ansible

How Ansible works

I spent a few hours playing with Ansible. Below you will find the basic concepts behind its terms and vocabulary. You can also find some examples on GitHub: https://github.com/ansible/ansible-examples . And a quick presentation, from the unavoidable wiki, here: https://en.wikipedia.org/wiki/Ansible_(software)

Playbook: a YAML file, or a tree of folders and files, containing all the Ansible declarations and notably the tasks to execute against hosts. The goal is to use and reuse “tasks” in a modular manner. Roles help with that.

Example of a simple playbook, reusing two roles:
---
# This playbook deploys a simple standalone Tomcat 7 server.
# list of servers targeted
- hosts: tomcat-servers
  remote_user: root
  become: yes
  become_method: sudo
  # tasks are not written in the playbook but in the two roles selinux, tomcat
  roles:
    - selinux
    - tomcat

Tasks: a task is a single call to a module, either embedded directly in a playbook or placed in a YAML file belonging to a role.

Example: install git and Apache with apt-get. Note the loop on items:
- name: Install services
  apt: pkg={{ item }} state=present
  with_items:
    - git
    - apache2

Modules: give us the capability to execute tasks on hosts.

Example: crypto, network and notification modules, ... In the task above, the module is apt.

Inventories (hosts): a static or dynamic list of the target systems that Ansible manipulates. It can be declared in a hosts file (Ansible’s, not the Linux one) or in a file somewhere in the playbook’s file/folder tree.

Example:
[production]
prod1.milano.org ansible_host=10.0.10.1
prod2.milano.org ansible_host=10.0.10.2
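
To see how a group header and its host lines fit together, here is a minimal shell sketch. It writes a sample inventory in Ansible's INI format (`host ansible_host=IP`) to a temporary file and lists the hosts of one group with awk. The staging host and the `list_group` helper are illustrative assumptions; with Ansible installed, `ansible production -i <file> --list-hosts` gives the authoritative answer.

```shell
#!/bin/sh
# Sample inventory written to a scratch file (host names and IPs are placeholders).
inventory="$(mktemp)"
cat > "$inventory" <<'EOF'
[production]
prod1.milano.org ansible_host=10.0.10.1
prod2.milano.org ansible_host=10.0.10.2

[staging]
stage1.milano.org ansible_host=10.0.20.1
EOF

# list_group FILE GROUP: print the host names (first column) found under [GROUP].
# Hypothetical helper for illustration only: it skips blank lines and comments,
# and does not handle :children or :vars sections.
list_group() {
  awk -v group="[$2]" '
    /^\[/ { in_group = ($0 == group); next }     # a new section starts
    in_group && NF && $1 !~ /^#/ { print $1 }    # host line inside the wanted group
  ' "$1"
}

list_group "$inventory" production
```

The helper only reads the first column, so any per-host variables after the host name are ignored.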

Roles: the role is a major concept in Ansible. See it as a self-contained set of fine-grained tasks that you assemble into playbooks. Roles can be stored globally, locally to a user or project, or in the Galaxy repository.

Example of roles: install netty, block all incoming traffic, ...
You can create the role structure with this command:

$ ansible-galaxy init acme --force
This creates the directory structure needed for organizing your code:
acme/
 .travis.yml
 README.md
 defaults/
 files/
 handlers/
 meta/
 tasks/
 templates/
 tests/
 vars/

Tags: let you mark a task or role with a specific label. This concept is essential for building modular deliveries, because the --tags option of ansible-playbook executes only the tasks marked with the given tags.

Example:
  - name: Copy file configuration
    copy:
     src: "app.yml"
     dest: "/home/user/app.yml"
    tags:
    - configure

Template: a Jinja2 (.j2) file. Very useful: you can generate resource, conf and YAML files … everything.

Example:
# {{ ansible_managed }}
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]

{% if (inventory_hostname in groups['lb_servers']) %}
-A INPUT -p tcp --dport {{ nginx_http_port }} -j ACCEPT
{% endif %}

-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
COMMIT

Thoughts about Ansible and Continuous Delivery in general

DevOps

First, here is my vision of DevOps in my project/product development industry.

Ops :

Provision : scale up/down machines

Example: add frontend servers, add security rules, SSH keys, users, …

Even if we try to stay as “agnostic” as possible overall, the granularity depends on the machine provider and on the purpose of the machine. Docker helps keep things simple by making resources anonymous and uniform. So we should be able to provision a CentOS machine with 4 CPUs, 16 GB of RAM, 32 GB of SSD for the OS and 1 TB of disk for the rest. That’s it. The Configure step does the rest.

Configure : add/remove capabilities to machines

Example: container or not, it means installing Tomcat, Apache, a MySQL node… Whatever kind of standardized server it is, it must come from a repository somewhere in the world. It also means adding or removing the rules that let services work properly.

I divide the Configure step into 3 categories: frontend servers, backend servers and persistence servers. Every configuration option fits into one of these 3 categories.

Now Dev

Deploy : add/remove/configure project

Example: apply a schema update, deploy a new version of the GUIs, run quality and environment tests, checks, reporting, …

Close to Configure, this step is tied to a project, which means it is tightly coupled to the developer team and to our business agenda. As with all custom development, tests and checks have to be run regularly. The repository is in-house. Mocking services is common practice, and so is a certain amount of instability.

Release : release the product.

Example: get all master branches from the GitHub repository, stop the servers one by one, install the new release, start the servers one by one, inform users before and after a successful release …

It is mainly a step for qualification and production environments, aimed at the final user, who focuses on the Product and not on the project. This step is rarely automatic and occurs after all project Deployments have succeeded. It is very much a phase of orchestration and consolidation that brings all deployed projects into qualification or production.

What about Ansible and DevOps

In my opinion Ansible lacks a high-level abstraction of DevOps. We should also be able to execute playbooks the way Maven does for projects through pom.xml.

I expected something like : ansible-playbook provision|configure|deploy|release product.yml
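
One way to approximate that command line with what Ansible already offers is to tag every task with its lifecycle phase and wrap `ansible-playbook --tags` in a small shell function. This is only a sketch: the `lifecycle` function, the four tag names and `product.yml` are assumptions for illustration, not something Ansible ships.

```shell
#!/bin/sh
# lifecycle PHASE PLAYBOOK: print the ansible-playbook command for one phase.
# Assumes the tasks in PLAYBOOK carry a tag named after their phase.
lifecycle() {
  phase="$1"
  playbook="$2"
  case "$phase" in
    provision|configure|deploy|release)
      echo "ansible-playbook $playbook --tags $phase"
      ;;
    *)
      echo "usage: lifecycle provision|configure|deploy|release PLAYBOOK" >&2
      return 1
      ;;
  esac
}

# Print the command for the deploy phase; run the printed line to execute it.
lifecycle deploy product.yml
```

Printing the command instead of running it keeps the sketch safe to try on a machine without Ansible.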

Dev and Ops are separated with Ansible. The purpose of Ansible is to modify hundreds of hosts with module/task commands. It is a batch process; Dev, on the other hand, needs orchestration and fine granularity when commanding hosts.

Ansible Pros & Cons

Good :

  • role and tags
  • yaml file for configuring
  • json for result of commands (better integration)
  • template engine
  • parent playbook, child (role)
  • Convention Over Configuration oriented
  • unique yaml file or  multiple tree folders/files structure for playbook
  • happy to discover notions of handler, pre-task, post-task, require, include

Bad :

  • no rollback management
  • no orchestration at all
  • composing and reusing Ansible objects is not straightforward
  • no central repository for playbooks with versioning and dependencies
    • note: I have just discovered ansible-galaxy, the role-sharing site
  • no local repository (similar to the Maven repository)
  • no lifecycle!
  • playbooks are difficult to manage: lots of files and folders!

Final word. Even though I am still searching for a better way to compose playbooks and orchestrate the release of my Products, Ansible is the best choice for now to handle Continuous Delivery in collaboration with Bamboo, Artifactory, Maven and Bitbucket. It forces us to standardize our Ops and Dev.

To be continued.

References :

About modular roles : https://opencredo.com/reusing-ansible-roles-with-private-git-repos-and-dependencies/

#ansible, #centos

Ansible – MySql5.7 for CentOS7.2

This is a quick playbook that installs MySQL 5.7, ready to go.

This script is based on and adapted from http://pierrepironin.fr/ansible-mysql57/

This script assumes that :

  • your private key is in a standard path; if you need to create keys, see the end of this post
  • ansible is installed correctly
  • /etc/ansible/hosts contains <YOUR_HOST_FQDN>
    Here is an extract from my own file; you should have a line like the one below

    # application
    [mysql57-centos72]
    <YOUR_HOST_FQDN> ansible_connection=ssh
# ansible-mysql57-centos72.yml:

- name: ansible-mysql57-centos72
  hosts: <YOUR_HOST_FQDN>
  remote_user: <YOUR_USER>
  become: true
  become_method: sudo
  ## Variables ##
  vars:
    ansible_ssh_private_key_file: ~/.ssh/id_rsa
    mysql_root_password: <YOUR_MYSQL_ROOT_PASS>
  ## Tasks ##
  tasks:
    - name: Get YUM repository for MySQL
      get_url:
        url: http://dev.mysql.com/get/mysql57-community-release-el7-7.noarch.rpm
        dest: /tmp/mysql57-community-release-el7-7.noarch.rpm
    - name: Install YUM repository for MySQL
      shell: /bin/rpm -Uvh /tmp/mysql57-community-release-el7-7.noarch.rpm
      register: yum_repo_return
      failed_when: "'conflict' in yum_repo_return.stderr"
    - name: Install MySQL community server
      yum:
        name: mysql-community-server
        state: present
    - name: Launch MySQL service
      service:
        name: mysqld
        state: started
        enabled: yes
    - name: Install required python MySQLdb lib to create databases and users
      yum:
        name: "{{item}}"
        state: present
      with_items:
        - gcc-c++
        - MySQL-python
    - name: Get temporary MySQL root password
      shell: grep 'temporary password' /var/log/mysqld.log | awk '{print $NF}'
      register: mysql_root_temp_password
    - name: Set the MySQL root password
      shell: mysqladmin -u root --password="{{ mysql_root_temp_password.stdout }}" password "{{ mysql_root_password }}"
      register: mysql_admin_root_password_result
      failed_when: "'(using password: NO)' in mysql_admin_root_password_result.stderr"
    - name: Tune MySQL configuration
      template:
        src: ./resources/my.cnf
        dest: /etc/my.cnf
        mode: "0644"
      notify:
        - restart mysqld
    - name: Create my datatable
      mysql_db:
        login_user: root
        login_password: "{{ mysql_root_password }}"
        name: MY_DATATABLE
        encoding: utf8
        collation: utf8_bin
    - name: Create MY_DBA user in MySQL and grant privileges
      mysql_user:
        login_user: root
        login_password: "{{ mysql_root_password }}"
        user: MY_DBA
        password: "{{ mysql_root_password }}"
        host: '%'
        priv: 'MY_DATATABLE.*:ALL'
  ## Handlers ##
  handlers:
    - name: restart mysqld
      service:
        name: mysqld
        state: restarted
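
The “Get temporary MySQL root password” task relies on the line that MySQL 5.7 writes to its log on first start. The sketch below runs the same grep and awk pipeline against a sample log in a temporary file, to show what ends up in `mysql_root_temp_password.stdout`. The log lines and the password in them are made up for illustration.

```shell
#!/bin/sh
# Sample log content; timestamps and password are invented for the demo.
log="$(mktemp)"
cat > "$log" <<'EOF'
2017-07-01T10:00:01.000000Z 1 [Note] A temporary password is generated for root@localhost: Xy7rT9qLp2ab
2017-07-01T10:00:05.000000Z 0 [Note] some other log line
EOF

# Same pipeline as in the playbook: keep the matching line, print its last field.
temp_password="$(grep 'temporary password' "$log" | awk '{print $NF}')"
echo "$temp_password"
```

If the log holds several such lines, the pipeline returns all of them; adding `| tail -n 1` keeps only the most recent one.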

Create keys for your user

Log in to your Ansible machine as root.

Generate a key pair for root (I use root):

ssh-keygen

Keys are generated in /root/.ssh/

Append the public key (copy and paste the content of id_rsa.pub) to the remote file ~<YOUR_USER>/.ssh/authorized_keys

Start the agent

eval $(ssh-agent -s)

Add the private key to the agent

ssh-add ~/.ssh/id_rsa

#centos, #mysql

Ansible – java8 for CentOS 7.2

This is a quick playbook that installs Java8, ready to go.

This script assumes that :

  • your private key is in a standard path; if you need to create keys, see the end of this post
  • ansible is installed correctly
  • /etc/ansible/hosts contains <YOUR_HOST_FQDN>
    Here is an extract from my own file; you should have a line like the one below

    # application
    [java8-centos72]
    <YOUR_HOST_FQDN> ansible_connection=ssh
# ansible-java8-centos72.yml:
# For Linux RH/Centos72
- name: java8-centos72
  hosts: <YOUR_HOST_FQDN>
  remote_user: <YOUR_USER>
  become: true
  become_method: sudo
  vars:
    ansible_ssh_private_key_file: ~/.ssh/id_rsa
    download_url: http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz
    download_folder: /opt
    java_version_path: jdk1.8.0_131
    java_version_path_short: jdk-8u131
    java_name: "{{download_folder}}/{{java_version_path}}"
    java_archive: "{{download_folder}}/{{java_version_path_short}}-linux-x64.tar.gz"

  tasks:
    - name: Download Java
      command: "wget -q -O {{java_archive}} --no-check-certificate --no-cookies --header 'Cookie: oraclelicense=accept-securebackup-cookie' {{download_url}} creates={{java_archive}}"

    - name: Unpack archive
      command: "tar -zxf {{java_archive}} -C {{download_folder}} creates={{java_name}}"

    - name: Fix ownership
      file: "state=directory path={{java_name}} owner=root group=root recurse=yes"

    - name: Make Java available for system
      command: 'alternatives --install "/usr/bin/java" "java" "{{java_name}}/bin/java" 2000'

    - name: Clean up
      file: "state=absent path={{java_archive}}"

Create keys for your user

Log in to your Ansible machine as root.

Generate a key pair for root (I use root):

ssh-keygen

Keys are generated in /root/.ssh/

Append the public key (copy and paste the content of id_rsa.pub) to the remote file ~<YOUR_USER>/.ssh/authorized_keys

Start the agent

eval $(ssh-agent -s)

Add the private key to the agent

ssh-add ~/.ssh/id_rsa

#ansible, #centos, #java

Latest Java8 u131 on CentOS 7.2

Straight to the commands:

cd /opt
sudo wget -c --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz
sudo tar xzf jdk-8u131-linux-x64.tar.gz
sudo rm -f jdk-8u131-linux-x64.tar.gz

cd /opt/jdk1.8.0_131/
sudo alternatives --install /usr/bin/java java /opt/jdk1.8.0_131/bin/java 2
sudo alternatives --config java

sudo alternatives --install /usr/bin/jar jar /opt/jdk1.8.0_131/bin/jar 2
sudo alternatives --install /usr/bin/javac javac /opt/jdk1.8.0_131/bin/javac 2
sudo alternatives --set jar /opt/jdk1.8.0_131/bin/jar
sudo alternatives --set javac /opt/jdk1.8.0_131/bin/javac

export JAVA_HOME=/opt/jdk1.8.0_131
export JRE_HOME=/opt/jdk1.8.0_131/jre
export PATH=$PATH:/opt/jdk1.8.0_131/bin:/opt/jdk1.8.0_131/jre/bin

#centos, #java

AMP on CentOS 7.2

This covers PHP 5.6.30, Apache2 and MariaDB 10.0.30.

First configure the MariaDB repo as described here: https://java8fx.wordpress.com/tag/mariadb/

Basically you install Apache, MariaDB and PHP:

sudo yum install httpd
sudo yum install MariaDB-server MariaDB-client
sudo systemctl start mariadb
sudo mysql_secure_installation
sudo systemctl enable mariadb.service
sudo yum install php php-mysql

Then upgrade PHP:

# upgrade
sudo rpm -Uvh https://mirror.webtatic.com/yum/el7/epel-release.rpm
sudo rpm -Uvh https://mirror.webtatic.com/yum/el7/webtatic-release.rpm
sudo yum remove php-common
sudo yum install -y php56w php56w-opcache php56w-xml php56w-mcrypt php56w-gd php56w-devel php56w-mysql php56w-intl php56w-mbstring
sudo systemctl restart httpd.service

That’s it.

#centos, #mariadb, #php

Debian: some work to do on this distro

I know Red Hat and CentOS a bit, but not Debian. I have just inherited a complex system running on Debian, and I need to rebuild it on a new Debian machine without any information on what is already installed … Lots of fun expected.

So I asked my favorite virtual machine reseller to install and configure an old Debian 7.9 for me, with a special driver for broadcast storage. Now I need to check whether the version I am running is the right one.

Check Debian version

lsb_release -da
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 7.9 (wheezy)
Release:        7.9
Codename:       wheezy

That looks good.

Open JDK

So I need OpenJDK 7u95-2.6.4-1~deb7u1. How do I install OpenJDK on Debian?

apt-cache --names-only search openjdk
openjdk-6-dbg - Java runtime based on OpenJDK (debugging symbols)
openjdk-6-demo - Java runtime based on OpenJDK (demos and examples)
openjdk-6-doc - OpenJDK Development Kit (JDK) documentation
openjdk-6-jdk - OpenJDK Development Kit (JDK)
openjdk-6-jre - OpenJDK Java runtime, using Hotspot JIT
openjdk-6-jre-headless - OpenJDK Java runtime, using Hotspot JIT (headless)
openjdk-6-jre-lib - OpenJDK Java runtime (architecture independent libraries)
openjdk-6-jre-zero - Alternative JVM for OpenJDK, using Zero/Shark
openjdk-6-source - OpenJDK Development Kit (JDK) source files
openjdk-7-dbg - Java runtime based on OpenJDK (debugging symbols)
openjdk-7-demo - Java runtime based on OpenJDK (demos and examples)
openjdk-7-doc - OpenJDK Development Kit (JDK) documentation
openjdk-7-jdk - OpenJDK Development Kit (JDK)
openjdk-7-jre - OpenJDK Java runtime, using Hotspot JIT
openjdk-7-jre-headless - OpenJDK Java runtime, using Hotspot JIT (headless)
openjdk-7-jre-lib - OpenJDK Java runtime (architecture independent libraries)
openjdk-7-jre-zero - Alternative JVM for OpenJDK, using Zero/Shark
openjdk-7-source - OpenJDK Development Kit (JDK) source files
uwsgi-plugin-jvm-openjdk-6 - Java plugin for uWSGI (OpenJDK 6)
uwsgi-plugin-jwsgi-openjdk-6 - JWSGI plugin for uWSGI (OpenJDK 6)

Ok I got it

 apt-get install openjdk-7-jdk openjdk-7-doc openjdk-7-jre-lib 

After this operation, 440 MB of additional disk space will be used.
Do you want to continue [Y/n]? Y
Media change: please insert the disc labeled
 'Debian GNU/Linux 7 _Wheezy_ - Official Snapshot amd64 LIVE/INSTALL Binary 20150906-01:37'
in the drive '/media/cdrom/' and press enter

I don’t have any CD-ROM, so let’s remove this source. Edit /etc/apt/sources.list and comment out the line that looks like this:

# deb cdrom:[Debian GNU/Linux 7 _Wheezy_ - Official Snapshot amd64 LIVE/INSTALL Binary 20150906-01:37]/ wheezy main
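
The same edit can be scripted with sed. The sketch below works on a temporary file with sample content, so nothing on the system is touched; pointing it at /etc/apt/sources.list (with sudo) is left to you. The second `deb` line is only sample content.

```shell
#!/bin/sh
# Scratch file standing in for /etc/apt/sources.list (sample content).
sources="$(mktemp)"
cat > "$sources" <<'EOF'
deb cdrom:[Debian GNU/Linux 7 _Wheezy_ - Official Snapshot amd64 LIVE/INSTALL Binary 20150906-01:37]/ wheezy main
deb http://archive.debian.org/debian wheezy main
EOF

# Prefix every active "deb cdrom:" line with "# "; a .bak copy of the file is kept.
sed -i.bak 's/^deb cdrom:/# &/' "$sources"
cat "$sources"
```

Run apt-get update afterwards to refresh the package lists.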

OK, I now have a more recent build of Java 7, which should be fine:

java version "1.7.0_121"
OpenJDK Runtime Environment (IcedTea 2.6.8) (7u121-2.6.8-2~deb7u1)
OpenJDK 64-Bit Server VM (build 24.121-b00, mixed mode)

SOLR 5.1

I need Solr 5.1 standalone. First, search for it:

apt-cache --names-only search solr

Apache2

apt-get install apache2 apache2-mpm-prefork

Php 5

With a long list of modules.

apt-get install php5-common libapache2-mod-php5 php5-cli php-db php-gettext php-pear php5-curl php5-gd php5-ldap php5-mcrypt php5-mysql php5-snmp php5-xcache php5-xmlrpc

#debian


AutosuggestFX reborn

After having spent days and days learning JavaFX and building components with it, I am restarting this project.

I hope JetBrains will give me a free IntelliJ licence to develop this component.

https://www.jetbrains.com/

qBittorrent and OpenTracker on CentOS7.2

qBittorrent will be used in my company to synchronize files across multiple servers (around 160 servers). What is my purpose with p2p technology? To maintain a common synchronized repository across these 160 servers, where any server can create a .torrent, seed or leech content.

So qBittorrent will be installed headless and used to seed and leech content. OpenTracker acts as the tracker server. To create a torrent file I use mktorrent. See the references at the end.

We have 3 types of servers (R, S, N):

  • R servers will have a tracker and seeder capabilities
  • S servers will have tracker, seeder and leecher capabilities
  • N servers with seeder and leecher capabilities

Number of servers for each type : R=80 S=2 N=80

[Figure: future architecture, v1.0.0]

Manage files: qBittorrent with no X (web access)

cd /tmp 
sudo yum -y groupinstall 'Development Tools'
sudo yum -y install qt-devel boost-devel openssl-devel qt5-qtbase-devel qt5-linguist

# lib torrent
sudo wget https://github.com/arvidn/libtorrent/releases/download/libtorrent-1_0_10/libtorrent-rasterbar-1.0.10.tar.gz
sudo tar -zxf libtorrent-rasterbar-1.0.10.tar.gz
cd libtorrent-rasterbar-1.0.10
sudo ./configure --prefix=/usr
sudo make
sudo make install
sudo ln -s /usr/lib/pkgconfig/libtorrent-rasterbar.pc /usr/lib64/pkgconfig/libtorrent-rasterbar.pc
sudo ln -s /usr/lib/libtorrent-rasterbar.so.8 /usr/lib64/libtorrent-rasterbar.so.8

Now the qBittorrent app

sudo git clone https://github.com/qbittorrent/qBittorrent.git
cd qBittorrent
sudo ./configure --prefix=/usr --disable-gui CPPFLAGS=-I/usr/include/qt5
sudo make
sudo make install

To install it as a service, create the unit file below under this name:

sudo vi /usr/lib/systemd/system/qbittorrent.service

    [Unit]
    Description=qbittorrent torrent server

    [Service]
    User=
    ExecStart=/usr/bin/qbittorrent-nox
    Restart=on-abort

    [Install]
    WantedBy=multi-user.target

Reload & Start the service

sudo systemctl daemon-reload
sudo systemctl start qbittorrent

The web UI is reachable at http://localhost:8080/
User: admin
Pass: adminadmin

Make a Torrent : mktorrent

I started from an existing torrent and tried to create one myself with mktorrent.

Example of torrent file taken from internet archive :
Page for the video : https://archive.org/details/MJP21_201309
The video is in standard mp4 : https://ia800406.us.archive.org/1/items/MJP21_201309/MJP21.mp4

And this is the decoded torrent (the decoder is listed in the references):

              name: MJP21_201309
          filename: MJP21_201309_archive.torrent
           comment: This content hosted at the Internet Archive at https://archive.org/details/MJP21_201309
Files may have changed, which prevents torrents from downloading correctly or completely; please check for an updated torrent at https://archive.org/download/MJP21_201309/MJP21_201309_archive.torrent
Note: retrieval usually requires a client that supports webseeding (GetRight style).
Note: many Internet Archive torrents contain a 'pad file' directory. This directory and the files within it may be erased once retrieval completes.
Note: the file MJP21_201309_meta.xml contains metadata about this torrent's contents.
              date: 02.09.2016 05:13:12 AM (1472807592)
        created_by: ia_make_torrent
             files: (5)
                    1: MJP21.gif
                    2: MJP21.mp4
                    3: MJP21.ogv
                    4: MJP21_201309_meta.sqlite
                    5: MJP21_201309_meta.xml
              size: 3 0 (323126493)
          announce: http://bt1.archive.org:6969/announce
     announce_list: 
                    - http://bt1.archive.org:6969/announce
                    - http://bt2.archive.org:6969/announce
         info_hash: 89e7e093da6ef468a29d471aabce2f3355c5f40f

I download the MJP21.mp4 file and create the torrent with mktorrent. Before that, get the mktorrent source code (link in the references), compile it and install it in /usr/bin:

tar -zxvf mktorrent-1.0.tar.gz
cd mktorrent-1.0
sudo make PREFIX=/usr install

Make the torrent from the video file downloaded above, with a tracker server I picked at random:

mktorrent -n MJP21 -a http://bttracker.debian.org:6969/announce MJP21.mp4
mktorrent 1.0 (c) 2007, 2009 Emil Renner Berthing
Hashing MJP21.mp4.
Writing metainfo file... done.
ls -ltr
-rw-rw-r--. 1 stephm stephm 209529758 Dec 29 10:42 MJP21.mp4
-rw-r--r--. 1 stephm stephm     16190 Dec 29 11:48 MJP21.torrent

Serve torrent: Open Tracker

cd /tmp
# for launching testing suite
sudo yum -y install nc
# open tracker
sudo yum -y install cvs
cvs -d :pserver:cvs@cvs.fefe.de:/cvs -z9 co libowfat
cd libowfat
make
cd ..
git clone git://erdgeist.org/opentracker
cd opentracker
make

Now that we have the opentracker binary, we simply launch it:

./opentracker &

DANGER. This gives everybody access to your tracker on port 6969. See the documentation for a more locked-down installation.

Note for Azure users

Don’t forget the inbound security rules


Final notes:

If you install the Apache httpd server to web-seed the mp4 files, don’t forget that SELinux can get in the way. The command below switches SELinux to permissive mode until the next reboot.

sudo setenforce 0

References :
qBittorrent : https://github.com/qbittorrent/qBittorrent
mktorrent : https://sourceforge.net/projects/mktorrent/
Open Tracker : http://erdgeist.org/arts/software/opentracker/
Decode Torrent : https://www.tools4noobs.com/online_tools/torrent_decode/
BitTorrent spec : https://wiki.theory.org/BitTorrentSpecification

 

Latest Java8 u112, quickly, on CentOS-like distros

This is a quick installation guide for Java8 on a CentOS-based distro. The Linux install is fresh and clean.

Steps:

  • Get Wget
  • Install the latest Java8

Get Wget

Very complicated…

sudo yum -y install wget

Install the latest Java8 u112

Very complicated…

cd /opt
sudo wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "download.oracle.com/otn-pub/java/jdk/8u112-b15/jdk-8u112-linux-x64.tar.gz"
sudo tar xzf jdk-8u112-linux-x64.tar.gz
sudo rm -f jdk-8u112-linux-x64.tar.gz

The alternatives command is useful for choosing the relevant Java8 binary

cd /opt/jdk1.8.0_112/
sudo alternatives --install /usr/bin/java java /opt/jdk1.8.0_112/bin/java 2
sudo alternatives --config java << EOF
1
EOF

sudo alternatives --install /usr/bin/jar jar /opt/jdk1.8.0_112/bin/jar 2
sudo alternatives --install /usr/bin/javac javac /opt/jdk1.8.0_112/bin/javac 2
sudo alternatives --set jar /opt/jdk1.8.0_112/bin/jar
sudo alternatives --set javac /opt/jdk1.8.0_112/bin/javac

Set up the environment variables

export JAVA_HOME=/opt/jdk1.8.0_112
export JRE_HOME=/opt/jdk1.8.0_112/jre
export PATH=$PATH:/opt/jdk1.8.0_112/bin:/opt/jdk1.8.0_112/jre/bin
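
These exports only last for the current shell session. A common way to make them permanent on CentOS is a small script under /etc/profile.d/, which login shells source. The sketch below generates that script in a temporary file first; the target name java.sh is an assumption, and the final copy needs sudo.

```shell
#!/bin/sh
# Assumes the JDK was unpacked to /opt/jdk1.8.0_112 as in the steps above.
java_home="/opt/jdk1.8.0_112"

# Build the snippet in a scratch file so it can be reviewed before installing it.
snippet="$(mktemp)"
cat > "$snippet" <<EOF
export JAVA_HOME=$java_home
export JRE_HOME=$java_home/jre
export PATH=\$PATH:$java_home/bin:$java_home/jre/bin
EOF

cat "$snippet"
# When the content looks right (java.sh is a hypothetical file name):
# sudo cp "$snippet" /etc/profile.d/java.sh
```

Only `$PATH` is escaped in the here-document, so it is expanded at login time while the JDK path is written out literally.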

#centos, #java

Spark first steps

This is my first try in the Spark universe.

Prerequisites: Spark concepts, RDD and MapReduce. You can read about them at https://en.wikipedia.org/wiki/MapReduce and https://en.wikipedia.org/wiki/Apache_Spark

This article is important for a good understanding: https://en.wikipedia.org/wiki/Lambda_architecture

Spark has batch abilities and, of course, streaming.

Install Java8_101 Scala Spark

Java

cd /opt
sudo wget --no-cookies --no-check-certificate --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F; oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/8u101-b13/jdk-8u101-linux-x64.tar.gz"
sudo tar xzf jdk-8u101-linux-x64.tar.gz
sudo rm -f jdk-8u101-linux-x64.tar.gz
cd /opt/jdk1.8.0_101/
sudo alternatives --install /usr/bin/java java /opt/jdk1.8.0_101/bin/java 2
sudo alternatives --config java

Scala

cd /opt
sudo wget http://downloads.typesafe.com/scala/2.11.8/scala-2.11.8.tgz
sudo tar -zxvf scala-2.11.8.tgz
# the alternatives below point at /opt/scala-latest, so link it to the unpacked directory
sudo ln -s scala-2.11.8 scala-latest
sudo alternatives --install /usr/bin/scala scala /opt/scala-latest/bin/scala 2
sudo alternatives --install /usr/bin/scalac scalac /opt/scala-latest/bin/scalac 2
sudo alternatives --install /usr/bin/scaladoc scaladoc /opt/scala-latest/bin/scaladoc 2
sudo alternatives --install /usr/bin/scalap scalap /opt/scala-latest/bin/scalap 2
# /etc/profile is owned by root, so append through sudo tee
echo "" | sudo tee -a /etc/profile > /dev/null
echo "## Setting SCALA_HOME for all USERS ##" | sudo tee -a /etc/profile > /dev/null
echo "export SCALA_HOME=/opt/scala-latest" | sudo tee -a /etc/profile > /dev/null
source /etc/profile

Spark

 
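cd /opt
# assumes spark-2.0.1-bin-hadoop2.7.tgz has already been downloaded to /opt
sudo tar xvf spark-2.0.1-bin-hadoop2.7.tgz
sudo ln -s spark-2.0.1-bin-hadoop2.7 spark-latest
echo 'export PATH=$PATH:/opt/spark-latest/bin' >> ~/.bashrc
source ~/.bashrc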

Now launch the Spark shell with 2 threads.

spark-shell --master local[2]

I ingest an access log from a Tomcat server and split it on “-”. The result is useless, but it shows how powerful the API is.

val inputfile = sc.textFile("/tmp/localhost_access_log.2016-10-10.txt")
val counts = inputfile.flatMap(line => line.split("-")).map(word => (word, 1)).reduceByKey(_+_);
counts.cache()
counts.saveAsTextFile("output")

You will find the result of the split in the directory output, under the path where you launched spark-shell (my home directory in this case).
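
To sanity-check what Spark writes under output, the same split-and-count can be approximated with awk on a small sample. This is a rough shell equivalent, not Spark: it runs on one machine, and unlike Scala's `split` it also counts empty trailing fields. The sample lines and the `count_tokens` helper are invented for illustration.

```shell
#!/bin/sh
# Two made-up lines standing in for the access log.
sample="$(mktemp)"
printf '%s\n' 'GET-200-index' 'GET-404-missing' > "$sample"

# Split each line on "-" and count every token,
# which mirrors flatMap + map + reduceByKey in the Scala snippet.
count_tokens() {
  awk -F'-' '
    { for (i = 1; i <= NF; i++) counts[$i]++ }
    END { for (token in counts) print token, counts[token] }
  ' "$1" | sort
}

count_tokens "$sample"
```

Each output line is a token followed by its count, which is the same information Spark stores as (token, count) pairs in its part files.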

Spark has a minimal web monitoring tool, available at http://your.spark.ip:4040/jobs/

#scala, #spark