Announcing Komiser Entreprise Edition

In just few months, Komiser, the Cloud Environment Inspector, has become a key player in cloud cost optimization ecosystem. With more than 2,200 stars on Github and 190,000 downloads, Komiser is widely used by major companies in their production environments.

Today we’re announcing Komiser Enterprise Edition (EE), our new commercial product, has reached public beta. Komiser EE allows you to identify potential cost savings on all major public cloud providers (AWS, GCP, OVH, Azure and DigitalOcean). Komiser EE is available in three tiers: Essentials, Business and Entreprise tiers :



For consistency, we are also renaming the open source Komiser products to Komiser Community Edition (CE).

Highlights

New Landing Page



Multiple Cloud Accounts Support



Reserved Instances Advisor



Early access to features



Early Access

Komiser EE is available in early access starting today. The registration process is as simple and automated as possible, so don’t fear endless registration forms nor crowded waiting queues before trying it out. Visit our website at https://cloud.komiser.io and get started in less than a minute!

Komiser Stays Open

Komiser EE is built on top of Komiser CE, that means that Komiser continues to evolve and will stay open source. Nothing changes! We are firm believers in open source, and Komiser will continue to be our main priority and a community-driven project.

Komiser:Optimize Cost and Security on AWS

Over the last decade, the cost of Amazon Web Services (AWS) has become a primary concern of businesses. That’s no surprise: AWS has many services that offer a range of IT resources — from IT infrastructure and bandwidth to analytics tools and machine learning — and each affects the total cloud bill.

While AWS offers many fully-managed services like CloudWatch, CloudTrail, Trusted Advisor, etc to help you detect potential cost savings. Understanding and managing cloud costs isn’t simple with AWS.

That’s why, I came up one year ago, with an open source tool called Komiser to help reduce your AWS infrastructure cost based on custom recommendations.

After 1 year of intense development, I’m thrilled to announce the fresh new release of Komiser: 2.0.0 with support of new AWS services:


AWS Services supported by Komiser

Highlights



With the GDPR becoming real in EU, logging and storage of (potentially) personally identifiable information now need to be reduced in many organizations. Komiser allows you to analyze and manage cloud cost, usage, security, and governance in one place. Hence, detecting potential vulnerabilities that could put your cloud environment at risk.

It allows you also to control your usage and create visibility across all used services to achieve maximum cost-effectiveness and get a deep understanding of how you spend on the AWS, GCP and Azure.



Usage

Below are the available downloads for the latest version of Komiser (2.0.0). Please download the proper package for your operating system and architecture.

Linux:

1
wget https://cli.komiser.io/2.0.0/linux/komiser

Windows:

1
wget https://cli.komiser.io/2.0.0/windows/komiser

Mac OS X:

1
wget https://cli.komiser.io/2.0.0/osx/komiser

Note: make sure to add the execution permission to Komiser chmod +x komiser and update the user’s $PATH variable.

Komiser is also available as a Docker image:

Docker:

1
docker run -d -p 3000:3000 --name komiser mlabouardy/komiser:2.0.0

If you point your favorite browser to http://localhost:3000, you should see Komiser awesome dashboard:



The versioned documentation can be found on https://docs.komiser.io.

Komiser is written in Golang and is MIT licensed — contributions are welcomed whether that means providing feedback or testing existing and new features.


https://komiser.io

Drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.

Ingest Data from RDS MySQL to Google BigQuery

In analytics, where queries over hundreds of gigabytes are the norm, performance is paramount and has a direct effect on the productivity of your team: running a query for hours means days of iterations between business questions. At Foxintelligence, we needed to move from traditional relational databases, like Postgres and MySQL to columnar database solutions. While RDBS like MySQL is great for normal transactional operations, it has significant drawbacks when it comes to real-time analytics on large amount of data. We found Google BigQuery to deliver superior results significantly for usability, performance, and cost for almost all our analytical use-cases, especially at scale.

Both Amazon RedShift and Google BigQuery provide much of the same functionalities, there are some fundamental differences between how these two operate. So you need to pick the right solution based on your data and business.

Once we decided which data warehouse we will use, we had to replicate data from RDS MySQL to Google BigQuery. This post walks you through the process of creating a data pipeline to achieve the replication between the two systems.

We used AWS Data Pipeline to export data from MySQL and feed it to BigQuery. The figure below summarises the entire workflow:



The pipeline starts based on a defined schedule and period, it launches a spot instance that will copy data from MySQL database to CSV files (split by table name) to an Amazon S3 bucket and then sending an Amazon SNS notification after the copy activity completes successfully. Following is our pipeline that accomplishes that:



Once the pipeline is finished, CSV files will be generated in the output S3 bucket:



The SNS notification will trigger a Lambda function, it will deploy a batch job based on a Docker image stored on our private Docker registry. The container will upload CSV files from S3 to GCS and load data to BigQuery:

You can use Storage Transfer Service to easily migrate your data from Amazon S3 to Cloud Storage.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
#!/bin/bash

echo "Download BigQuery Credentials"

aws s3 cp s3://$GCP_AUTH_BUCKET/auth.json .

echo "Upload CSV to GCS"

mkdir -p csv
rm tables

for raw in $(aws s3 ls s3://$S3_BUCKET/ | awk -F " " '{print $2}');
do
table=${raw%/}
if [[ $table != "" && $table != df* ]]
then
echo "Table: $table"
csv=$(aws s3 ls s3://$S3_BUCKET/$table/ | awk -F " " '{print $4}' | grep ^ | sort -r | head -n1)

echo $table >> tables

echo "CSV: $csv"

echo "Copy csv from S3"
aws s3 cp s3://$S3_BUCKET/$table/$csv csv/$table.csv

echo "Upload csv to GCP"
gsutil cp csv/$table.csv gs://$GS_BUCKET/$table.csv
fi
done

echo "Import CSV to BigQuery"

python app.py

We have written a Python script to clean up raw data (encoding issues), transform (map MySQL data types to BQ data types) and load CSV file to BigQuery:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
import mysql.connector
import os
import time
from mysql.connector import Error
from google.cloud import bigquery

bigquery_client = bigquery.Client()

def mapToBigQueryDataType(columnType):
if columnType.startswith('int'):
return 'INT64'
if columnType.startswith('varchar'):
return 'STRING'
if columnType.startswith('decimal'):
return 'FLOAT64'
if columnType.startswith('datetime'):
return 'DATETIME'
if columnType.startswith('text'):
return 'STRING'
if columnType.startswith('date'):
return 'DATE'
if columnType.startswith('time'):
return 'TIME'

def wait_for_job(job):
while True:
job.reload()
if job.state == 'DONE':
if job.error_result:
raise RuntimeError(job.errors)
return
time.sleep(1)

try:
conn = mysql.connector.connect(host=os.environ['MYSQL_HOST'],
database=os.environ['MYSQL_DB'],
user=os.environ['MYSQL_USER'],
password=os.environ['MYSQL_PWD'])
if conn.is_connected():
print('Connected to MySQL database')

lines = open('tables').read().split("\n")
for tableName in lines:
print('Table:',tableName)

cursor = conn.cursor()
cursor.execute('SHOW FIELDS FROM '+os.environ['MYSQL_DB']+'.'+tableName)

rows = cursor.fetchall()

schema = []
for row in rows:
schema.append(bigquery.SchemaField(row[0].replace('\'', ''), mapToBigQueryDataType(row[1])))


job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.CSV
job_config.autodetect = True
job_config.max_bad_records = 2
job_config.allow_quoted_newlines = True
job_config.schema = schema

job = bigquery_client.load_table_from_uri(
'gs://'+os.environ['GCE_BUCKET']+'/'+tableName+'.csv',
bigquery_client.dataset(os.environ['BQ_DATASET']).table(tableName),
location=os.environ['BQ_LOCATION'],
job_config=job_config)

print('Loading data to BigQuery:', tableName)

wait_for_job(job)


print('Loaded {} rows into {}:{}.'.format(
job.output_rows, os.environ['BQ_DATASET'], tableName))

except Error as e:
print(e)
finally:
conn.close()

As a result, the tables will be imported to BigQuery:



While this solution worked like a charm, we didn’t stop there. Google Cloud announced the public beta release of BigQuery Data Transfer. This service allows you to automates data movement from multiple data sources like S3 or GCS to BigQuery on a scheduled, managed basis. So it was a great use case to test this service to manage recurring load jobs from Amazon S3 into BigQuery as shown in the figure below:



This services comes with some trade-offs such as Google BigQuery cannot create tables as part of data transfer process. Hence, a Lambda function was used to drop the old dataset, and create the destination tables and their schema in advance of running the transfer. The function handler code is self-explanatory:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
func handler(ctx context.Context) error {
client, err := bigquery.NewClient(ctx, os.Getenv("PROJECT_ID"))
if err != nil {
return err
}

err = RemoveDataSet(client)
if err != nil {
return err
}

err = CreateDataSet(client)
if err != nil {
return err
}

uri := fmt.Sprintf("%s:%s@tcp(%s)/%s",
os.Getenv("MYSQL_USERNAME"), os.Getenv("MYSQL_PASSWORD"),
os.Getenv("MYSQL_HOST"), os.Getenv("MYSQL_DATABASE"))

db, err := sql.Open("mysql", uri)
if err != nil {
return err
}

file, err := os.Open("tables")
if err != nil {
return err
}
defer file.Close()

scanner := bufio.NewScanner(file)
for scanner.Scan() {
tableName := scanner.Text()
fmt.Println("Table:", tableName)

columns, _ := GetColumns(tableName, db)
fmt.Println("Columns:", columns)

CreateBQTable(tableName, columns, client)
}

if err := scanner.Err(); err != nil {
return err
}
return nil
}

func main() {
lambda.Start(handler)
}

The function will be triggered by a CloudWatch Event, once the data pipeline finishes exporting CSV files:



Finally, we created Transfer jobs for each table on BigQuery to load data from S3 bucket to BigQuery table:



Using Google BigQuery to store internally hundreds of gigabytes of data (soon terabytes) with the capability to analyse it in few seconds give us a massive push toward business intelligence and data-driven insights.

Like what you’re read­ing? Check out my book and learn how to build, secure, deploy and manage production-ready Serverless applications in Golang with AWS Lambda.

Drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.

CI/CD for Android and iOS Apps on AWS

Mobile apps have taken center stage at Foxintelligence. After implementing CI/CD workflows for Dockerized Microservices, Serverless Functions and Machine Learning models, we needed to automate the release process of our mobile application — Cleanfox — to deliver features we are working on continuously and ensure high quality app. While the CI/CD concepts remains the same, its practicalities are somewhat different. In this post, I will walk you through how we achieved that, including the lessons learned and formed along the way to boost your Android and iOS application development drastically.



The Jenkins cluster (figure below) consists of a dedicated Jenkins master with a couple of slave nodes inside an autoscaling group. However, iOS apps can be built only on macOS machine. We typically use an unused Mac Mini computer located in the office devoted to these tasks.

We have configured the Mac mini to establish a VPN connection (at system startup) to the OpenVPN server deployed on the target VPC.



We setup an SSH tunnel to the Mac node using dynamic port forwarding. Once the tunnel is active, you can add the machine to Jenkins set of worker nodes:



This guide assumes you have a fresh install of the latest stable version of Xcode along with Fastlane.

Once we had a good part of this done, we used Fastlane to automate the deployment process. This tool offers a set of scripts written in Ruby to handle tedious tasks such as code signing, managing certificates and releasing ipa to the app store for the end users.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
default_platform(:ios)

platform :ios do

lane :tests do
scan(
scheme: options[:scheme],
clean: true,
skip_detect_devices: true,
build_for_testing: true,
sdk: 'iphoneos',
should_zip_build_products: true
)
firebase_test_lab_ios_xctest(
gcp_project: 'cleanfox-XXXX',
devices: [
{
ios_model_id: 'ipadmini4',
ios_version_id: '12.0',
locale: 'fr_FR',
orientation: 'portrait'
},
{
ios_model_id: 'iphone7plus',
ios_version_id: '12.0',
locale: 'fr_FR',
orientation: 'portrait'
},
{
ios_model_id: 'iphone8',
ios_version_id: '12.0',
locale: 'fr_FR',
orientation: 'portrait'
},
{
ios_model_id: 'iphonexsmax',
ios_version_id: '12.0',
locale: 'fr_FR',
orientation: 'portrait'
}
]
)
end

lane :increment_build do
version = get_version_number
latestBuildNumber = latest_testflight_build_number(version: version)
increment_build_number(
build_number: latestBuildNumber + 1,
xcodeproj: "Cleanfox.xcodeproj"
)
end

lane :develop do
increment_build
build_app(scheme: "Sandbox",
workspace: "Cleanfox.xcworkspace",
include_bitcode: true)
end

lane :beta do
increment_build
build_app(scheme: "Staging",
workspace: "Cleanfox.xcworkspace",
include_bitcode: true)
upload_to_testflight
end

lane :prod_testflight do
increment_build_number(
build_number: latest_testflight_build_number + 1,
xcodeproj: "Cleanfox.xcodeproj"
)
build_app(scheme: "Production",
workspace: "Cleanfox.xcworkspace",
include_bitcode: true)
upload_to_testflight(skip_waiting_for_build_processing: true)
end
end

Also, we created a Jenkinsfile, which defines a set of steps (each step calls a certain actions — lane — defined in the above Fastfile) that will be executed on Jenkins based on the branch name (GitFlow model):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
def bucket = 'mobile-artifacts-foxintelligence'

node('mac') {
try {
stage('Checkout') {
checkout scm
notifySlack('STARTED')
}

stage('Install Dependencies') {
sh "pod install"
}

stage('Build') {
if (env.BRANCH_NAME == 'master'){
sh "bundle exec fastlane prod_testflight"
}
if (env.BRANCH_NAME == 'preprod'){
sh "bundle exec fastlane staging"
}
if (env.BRANCH_NAME == 'develop'){
sh "bundle exec fastlane develop"
}
}

stage('Push') {
sh "aws s3 cp Cleanfox.ipa s3://${bucket}/ios/cleanfox-${commitID()}.ipa"

if (env.BRANCH_NAME == 'master'){
sh "aws s3 cp Cleanfox.ipa s3://${bucket}/ios/cleanfox-latest.ipa"
}
if (env.BRANCH_NAME == 'preprod'){
sh "aws s3 cp Cleanfox.ipa s3://${bucket}/ios/cleanfox-preprod.ipa"
}
if (env.BRANCH_NAME == 'develop'){
sh "aws s3 cp Cleanfox.ipa s3://${bucket}/ios/cleanfox-develop.ipa"
}
}

stage('Test') {
if (env.BRANCH_NAME == 'master'){
sh 'bundle fastlane tests --scheme "Production"'
}
if (env.BRANCH_NAME == 'preprod'){
sh 'bundle fastlane tests --scheme "Staging"'
}
if (env.BRANCH_NAME == 'develop'){
sh 'bundle fastlane tests --scheme "Sandbox"'
}
}
}catch(e){
currentBuild.result = 'FAILED'
throw e
}finally{
notifySlack(currentBuild.result)
}
}

def notifySlack(String buildStatus){
buildStatus = buildStatus ?: 'SUCCESSFUL'
def colorCode = '#FF0000'
def subject = "Name: '${env.JOB_NAME}'\nStatus: ${buildStatus}\nBuild ID: ${env.BUILD_NUMBER}"
def summary = "${subject}\nMessage: ${commitMessage()}\nAuthor: ${commitAuthor()}\nURL: ${env.BUILD_URL}"

if (buildStatus == 'STARTED') {
colorCode = '#546e7a'
} else if (buildStatus == 'SUCCESSFUL') {
colorCode = '#2e7d32'
} else {
colorCode = '#c62828c'
}

slackSend (color: colorCode, message: summary)
}

def commitAuthor(){
sh 'git show -s --pretty=%an > .git/commitAuthor'
def commitAuthor = readFile('.git/commitAuthor').trim()
sh 'rm .git/commitAuthor'
commitAuthor
}

def commitID() {
sh 'git rev-parse HEAD > .git/commitID'
def commitID = readFile('.git/commitID').trim()
sh 'rm .git/commitID'
commitID
}

def commitMessage() {
sh 'git log --format=%B -n 1 HEAD > .git/commitMessage'
def commitMessage = readFile('.git/commitMessage').trim()
sh 'rm .git/commitMessage'
commitMessage
}

The pipeline is divided into 5 stages:

  • Checkout: clone the GitHub repository.
  • Quality & Unit Tests: check whether our code is well formatted and follows Swift best practices and run unit tests.
  • Build: build and sign the app.
  • Push: store the deployment package (.ipa file) to an S3 bucket.
  • UI Test: launch UI tests on Firebase Test Lab across a wide variety of devices and device configurations.

If a build on the CI passes, a Slack notification will be sent (broken build will notify developers to investigate immediately).

Note the usage of the git commit ID as a name for the deployment package to give a meaningful and significant name for each release and be able to roll back to a specific commit if things go wrong.

Once the pipeline is triggered, a new build should be created as follows:



At the end, Jenkins will launch UI Tests based on XCTest framework on Firebase Test Lab across multiple virtual and physical devices and different screen sizes.



We gave a try to AWS Device Farm, but we needed to get over 2 problems at the same time. We sought waiting for a very short time, to receive tests result, without paying too much.

Test Lab exercises your app on devices installed and running in a Google data center. After your tests finish, you can see the results including logs, videos and screenshots in the Firebase console.



You can enhance the workflow to automate taking screenshots through fastlane snapshot command and saves hours of valuable time you’ll burn taking screenshots. To upload the screenshots, metadata and the IPA file to iTunes Connect, you can use deliver command, which is already installed and initialized as part of fastlane.

The Android CI/CD workflow is quite straightforward, as it needs only the JDK environment with Android SDK preinstalled, we are running the CI on a Jenkins slave deployed into an EC2 Spot instance. The pipeline contains the following stages:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
def bucket = 'mobile-artifacts-foxintelligence'

node('android') {
try {
stage('Checkout') {
checkout scm
notifySlack('STARTED')
}

stage('Clean & Prepare') {
sh "./gradlew clean"
}

stage('Quality Tests') {
sh "./gradlew lintDebug"
androidLint pattern: 'app/build/reports/lint-results-debug.xml'
}

stage('Unit Tests') {
sh "./gradlew testDebug --stacktrace"

if (env.BRANCH_NAME == 'master'){
sh "./gradlew testReleaseUnitTest"
}
if (env.BRANCH_NAME == 'preprod'){
sh "./gradlew testStagingUnitTest"
}
if (env.BRANCH_NAME == 'develop'){
sh "./gradlew testSandboxUnitTest"
}

publishHTML([reportDir: 'app/build/reports/tests/testDebugUnitTest', reportFiles: 'index.html', reportName: 'Unit Tests Report'])

junit 'app/build/test-results/testDebugUnitTest/*.xml'
}

stage('Build'){
sh "./gradlew assembleDebug"

if (env.BRANCH_NAME == 'master'){
sh "./gradlew compileReleaseKotlin"
}
if (env.BRANCH_NAME == 'preprod'){
sh "./gradlew compileStagingKotlin"
}
if (env.BRANCH_NAME == 'develop'){
sh "./gradlew compileSandboxKotlin"
}
}

stage('Push'){
sh "aws s3 cp app/build/outputs/apk/debug/app-debug.apk s3://${bucket}/android/${commitID()}.apk"
}

stage('UI Tests'){
sh "./gradlew assembleDebugAndroidTest"
sh "gcloud firebase test android run --app app/build/outputs/apk/debug/app-debug.apk --test app/build/outputs/apk/androidTest/debug/app-debug-androidTest.apk"
}
}catch(e){
currentBuild.result = 'FAILED'
throw e
}finally{
notifySlack(currentBuild.result)
}
}

The pipeline could be drawn up as the following steps:

  • Check out the working branch from a remote repository.
  • Run the code through lint to find poorly structured code that might impact the reliability, efficiency and make the code harder to maintain. The linter will produces XML files which will be parsed by the Android Lint Plugin.
  • Launch Unit Tests. The JUnit plugin provides a publisher that consumes XML test reports generated and provides some graphical visualization of the historical test results as well as a web UI for viewing test reports, tracking failures, and so on.


  • Build debug or release APK based on the current Git branch name.
  • Upload the artifact to an S3 bucket.
  • Similarly, after the instrumentation tests have finished running, the Firebase web UI will then display the results of each test — in addition to information such as a video recording of the test run, the full Logcat, and screenshots taken:


To bring down testing time (and reduce the cost), we are testing Flank to split the test suite into multiple parts and execute them in parallel across multiple devices.

Our Continuous Integration workflow is sailing now. So far we’ve found that this process strikes the right balance. It automates the repetitive aspects, provides protection but is still lightweight and flexible. The last thing we want is the ability to ship at any time. We have an additional stage to upload the iOS artifact to Test Flight for distribution to our awesome beta tests.

Like what you’re read­ing? Check out my book and learn how to build, secure, deploy and manage production-ready Serverless applications in Golang with AWS Lambda.

Drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.

Why you should join Foxintelligence at the AWS Summit Paris 2019



How to play PUBG on AWS

AWS GPU instances are known for deep learning purposes but they can also be used for running video games. This tutorial goes through how to set up your own EC2 GPU optimised instance to run the top-selling and most played game “PlayerUnknown’s Battlegrounds (PUBG)”.

To get started, make sure you are in the AWS region closest to you, select Microsoft Windows Server to be the AMI and set the instance type to be g2.2xlarge. The instance is backed by Nvidia Grid GPU (Kepler GK104), 8x hardware hyper-threads from an Intel Xeon E5–2670 and 15GB of RAM.



For games with resource-intensive, you should use the next generation of GPU instances: P2, P3 and G3 (have up to 4 NVIDIA Tesla M60 GPUs).

After this is done, click on “Launch Instances”, and you should see a screen showing that your instance is been created:



To connect to your Windows instance, you must retrieve the initial administrator password and specify this password when you connect to your instance using Remote Desktop:

Before you attempt to log in using Remote Desktop Connection, you must open port 3389 on the security group attached to your instance



After you connect, install Microsoft Direct X11 after installing Chrome (it saves a lot of time):



Next, install the graphic driver for maximum gaming performance:



Once installed, make sure to reboot the instance for changes to take effect:



Then, install Steam, login using your account and install PUBG from the “Library” section:



You can take advantage of AWS high network performance (up to 10 Gbps of bandwidth):



Once the game is installed, you can play PUBG on your virtualized GPU instance:



You can take this further, and use Steam In-Home Streaming feature to stream your game from your EC2 instance to your Mac:



Enjoy the game ! you can now play your games on any device connected to the same network:



You might want to bake an AMI based on your instance to avoid set it up all again the next time you want to play and use spot instances to reduce the instance cost. Also, make sure to stop your instances when you’re done for the day to avoid incurring charges. GPU instances are costly (disk storage also costs something, and can be significant if you have a large disk footprint).

Drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.

Docker on Elastic Beanstalk Tips

AWS Elastic Beanstalk is one of the most used PaaS today, it allows you to deploy your application without provisioning the underlying infrastructure while maintaining the high availability of your application. However, it’s painful to use due to the lack of documentation and real-world scenarios. In this post, I will walk you through how to use Elastic Beanstalk to deploy Docker containers from scratch. Followed by how to automate your deployment process with a Continuous Integration pipeline. At the end of this post, you should be familiar with advanced topics like debugging and monitoring of your applications in EB.



1 – Environment Setup

To get started, create a new Application using the following AWS CLI command:

1
2
aws elasticbeanstalk create-application --application-name avengers \
--region eu-west-3

Create a new environment. Let’s call it “staging” :

1
2
3
4
5
aws elasticbeanstalk create-environment --application-name avengers \
--environment-name staging \
--solution-stack-name "64bit Amazon Linux 2017.09 v2.9.2 running Docker 17.12.0-ce" \
--option-settings file://options.json \
--region eu-west-3

Head back to AWS Elastic Beanstalk Console, your new environment should be created:



Point your browser to the environment URL, a sample Docker application should be displayed:



Let’s deploy our application. I wrote a small web application in Go to return a list of Marvel Avengers (I see you Thanos 😉 )

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
package main

import (
"encoding/json"
"io/ioutil"
"log"
"net/http"
)

type Avenger struct {
Character string `json:"character"`
Name string `json:"name"`
}

var avengers []Avenger

func init() {
data, _ := ioutil.ReadFile("avengers.json")
json.Unmarshal(data, &avengers)
}

func IndexHandler(w http.ResponseWriter, r *http.Request) {
response, _ := json.Marshal(avengers)
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(200)
w.Write(response)
}

func main() {
http.HandleFunc("/", IndexHandler)
if err := http.ListenAndServe(":3000", nil); err != nil {
log.Fatal(err)
}
}

Next, we will create a Dockerfile to build the Docker image. Go is a compiled language, therefore we can use the Docker multi-stage feature to build a lightweight Docker image:

1
2
3
4
5
6
7
8
9
10
11
12
13
FROM golang:1.10 as builder
WORKDIR /go/src/github.com/mlabouardy/docker-eb-ci-mon
COPY main.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest
MAINTAINER mlabouardy
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /go/src/github.com/mlabouardy/docker-eb-ci-mon/app .
COPY avengers.json .
EXPOSE 3000
CMD ["./app"]

Next, we create a Dockerrun.aws.json that describes how the container will be deployed in Elastic Beanstalk:

1
2
3
4
5
6
7
8
9
10
11
12
{
"AWSEBDockerrunVersion": "1",
"Image": {
"Name": "mlabouardy/avengers",
"Update": "true"
},
"Ports": [
{
"ContainerPort": "3000"
}
]
}

Now the application is defined, create an application bundle by creating a ZIP package:

1
zip -r deployment.zip .

Then, create a S3 bucket to store the different versions of your application bundles:

1
aws s3 mb s3://avengers-docker-eb --region AWS_REGION

And create a new application version from the application bundle:

1
2
3
4
5
aws elasticbeanstalk create-application-version --application-name avengers \
--version-label v1 \
--source-bundle S3Bucket="avengers-docker-eb",S3Key="deployment.zip" \
--auto-create-application \
--region AWS_REGION


Finally, deploy the version to the staging environment:

1
2
3
aws elasticbeanstalk update-environment --application-name avengers \
--environment-name staging \
--version-label v1 --region AWS_REGION

Give it a few seconds while it’s deploying the new version:



Then, repoint your browser to the environment URL, a list of Avengers will be returned in a JSON format as follows:



Now that our Docker application is deployed, let’s automate this process by setting up a CI/CD pipeline.

2 – CI/CD Pipeline

I opt for CircleCI, but you’re free to use whatever CI server you’re familiar with. The same steps can be applied.

Create a circle.yml file with the following content:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
version: 2
jobs:
build:
docker:
- image: circleci/golang:1.10

working_directory: /go/src/github.com/mlabouardy/docker-eb-ci-mon

steps:
- checkout

- setup_remote_docker

- run:
name: Install AWS CLI
command: |
sudo apt-get update
sudo apt-get install -y awscli

- run:
name: Test
command: go test

- run:
name: Build
command: docker build -t mlabouardy/avengers:latest .

- run:
name: Push
command: |
docker login -u$DOCKERHUB_LOGIN -p$DOCKERHUB_PASSWORD
docker tag mlabouardy/avengers:latest mlabouardy/avengers:${CIRCLE_SHA1}
docker push mlabouardy/avengers:latest
docker push mlabouardy/avengers:${CIRCLE_SHA1}

- run:
name: Deploy
command: |
zip -r deployment-${CIRCLE_SHA1}.zip .
aws s3 cp deployment-${CIRCLE_SHA1}.zip s3://avengers-docker-eb --region eu-west-3
aws elasticbeanstalk create-application-version --application-name avengers \
--version-label ${CIRCLE_SHA1} --source-bundle S3Bucket="avengers-docker-eb",S3Key="deployment-${CIRCLE_SHA1}.zip" --region eu-west-3
aws elasticbeanstalk update-environment --application-name avengers \
--environment-name staging --version-label ${CIRCLE_SHA1} --region eu-west-3

The pipeline will firstly prepare the environment, installing the AWS CLI. Then run unit tests. Next, a Docker image will be built, then pushed to DockerHub. Last step is creating a new application bundle and deploying the bundle to Elastic Beanstalk.

In order to grant Circle CI permissions to call AWS operations, we need to create a new IAM user with following IAM policy:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"elasticbeanstalk:*",
"s3:*",
"ec2:*",
"cloudformation:*",
"autoscaling:*",
"elasticloadbalancing:*"
],
"Resource": "*"
}
]
}

Generate AWS access & secret keys. Then, head back to Circle CI and click on the project settings and paste the credentials :



Now, everytime you push a change to your code repository, a build will be triggered:



And a new version will be deployed automatically to Elastic Beanstalk:



3 – Monitoring

Monitoring your applications is mandatory. Unfortunately, CloudWatch doesn’t expose useful metrics like Memory usage of your applications in Elastic Beanstalk. Hence, in this part, we will solve this issue by creating our custom metrics.

I will install a data collector agent on the instance. The agent will collect metrics and push them to a time-series database.

To install the agent, we will use .ebextensions folder, on which we will create 3 configuration files:

  • 01-install-telegraf.config: install Telegraf on the instance
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
container_commands:
01downloadpackage:
command: "wget https://dl.influxdata.com/telegraf/releases/telegraf-1.6.0-1.x86_64.rpm -O /tmp/telegraf.rpm"
ignoreErrors: true
02installpackage:
command: "yum localinstall -y /tmp/telegraf.rpm"
ignoreErrors: true
03removepackage:
command: "rm /tmp/telegraf.rpm"
ignoreErrors: true
04enablereboot:
command: "chkconfig telegraf on"
ignoreErrors: true
05fixpermission:
command: "usermod -a -G docker telegraf"
ignoreErrors: true
  • 02-config-file.config: create a Telegraf configuration file to collect system usage & docker containers metrics.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
files:
"/etc/telegraf/telegraf.conf":
mode: "000666"
owner: root
group: root
content: |
[global_tags]
hostname="Avengers"

# Read metrics about CPU usage
[[inputs.cpu]]
percpu = false
totalcpu = true
fieldpass = [ "usage*" ]
name_suffix = "_vm"

# Read metrics about disk usagee
[[inputs.disk]]
fielddrop = [ "inodes*" ]
mount_points=["/"]
name_suffix = "_vm"

# Read metrics about network usage
[[inputs.net]]
interfaces = [ "eth0", "eth1" ]
fielddrop = [ "icmp*", "ip*", "tcp*", "udp*" ]
name_suffix = "_vm"

# Read metrics about memory usage
[[inputs.mem]]
name_suffix = "_vm"

# Read metrics about swap memory usage
[[inputs.swap]]
name_suffix = "_vm"

# Read metrics about system load and uptime
[[inputs.system]]
name_suffix = "_vm"

# Read metrics from docker socket api
[[inputs.docker]]
endpoint = "unix:///var/run/docker.sock"
container_names = []
name_suffix = "_docker"

[[outputs.influxdb]]
database = "instances"
urls = ["http://172.31.38.51:8086"]
namepass = ["*_vm"]

[[outputs.influxdb]]
database = "containers"
urls = ["http://172.31.38.51:8086"]
namepass = ["*_docker"]
  • 03-start-telegraf.config: start Telegraf agent.
1
2
3
4
container_commands:
01starttelegraf:
command: "service telegraf start"
ignoreErrors: true

Once the application version is deployed to Elastic Beanstalk, metrics will be pushed to your timeseries database. In this example, I used InfluxDB as data storage and I created some dynamic Dashboards in Grafana to visualize metrics in real-time:

Containers:



Hosts:



Note: for in-depth explaination on how to configure Telegraf, InfluxDB & Grafana read my previous article.

Full code can be found on my GitHub. Make sure to drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy

Infrastructure Cost Optimization with Lambda

Having multiple environments is important to build a continuous integration/deployment pipeline and be able to reproduce bugs in production with ease but this comes at price. In order to reduce cost of AWS infrastructure, instances which are running 24/7 unnecessarily (sandbox & staging environments) must be shut down outside of regular business hours.

The figure below describes an automated process to schedule, stop and start instances to help cutting costs. The solution is a perfect example of using Serverless computing.



Note: full code is available on my GitHub.

2 Lambda functions will be created, they will scan all environments looking for a specific tag. The tag we use is named ‘Environment’. Instances without an Environment tag will not be affected:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
func getInstances(cfg aws.Config) ([]Instance, error) {
instances := make([]Instance, 0)

svc := ec2.New(cfg)
req := svc.DescribeInstancesRequest(&ec2.DescribeInstancesInput{
Filters: []ec2.Filter{
ec2.Filter{
Name: aws.String("tag:Environment"),
Values: []string{os.Getenv("ENVIRONMENT")},
},
},
})
res, err := req.Send()
if err != nil {
return instances, err
}

for _, reservation := range res.Reservations {
for _, instance := range reservation.Instances {
for _, tag := range instance.Tags {
if *tag.Key == "Name" {
instances = append(instances, Instance{
ID: *instance.InstanceId,
Name: *tag.Value,
})
}
}
}
}

return instances, nil
}

The StartEnvironment function will query the StartInstances method with the list of instance ids returned by the previous function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
func startInstances(cfg aws.Config, instances []Instance) error {
instanceIds := make([]string, 0, len(instances))
for _, instance := range instances {
instanceIds = append(instanceIds, instance.ID)
}

svc := ec2.New(cfg)
req := svc.StartInstancesRequest(&ec2.StartInstancesInput{
InstanceIds: instanceIds,
})
_, err := req.Send()
if err != nil {
return err
}
return nil
}

Similarly, the StopEnvironment function will query the StopInstances method:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
func stopInstances(cfg aws.Config, instances []Instance) error {
instanceIds := make([]string, 0, len(instances))
for _, instance := range instances {
instanceIds = append(instanceIds, instance.ID)
}

svc := ec2.New(cfg)
req := svc.StopInstancesRequest(&ec2.StopInstancesInput{
InstanceIds: instanceIds,
})
_, err := req.Send()
if err != nil {
return err
}
return nil
}

Finally, both functions will post a message to Slack channel for real-time notification:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
func postToSlack(color string, title string, instances string) error {
message := SlackMessage{
Text: title,
Attachments: []Attachment{
Attachment{
Text: instances,
Color: color,
},
},
}

client := &http.Client{}
data, err := json.Marshal(message)
if err != nil {
return err
}

req, err := http.NewRequest("POST", os.Getenv("SLACK_WEBHOOK"), bytes.NewBuffer(data))
if err != nil {
return err
}

resp, err := client.Do(req)
if err != nil {
return err
}

return nil
}

Now our functions are defined, let’s build the deployment packages (zip files) using the following Bash script:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#!/bin/bash

echo "Building StartEnvironment binary"
GOOS=linux GOARCH=amd64 go build -o main start/*.go

echo "Creating deployment package"
zip start-environment.zip main
rm main

echo "Building StopEnvironment binary"
GOOS=linux GOARCH=amd64 go build -o main stop/*.go

echo "Creating deployment package"
zip stop-environment.zip main
rm main

The functions require an IAM role to be able to interact with EC2. The StartEnvironment function has to be able to describe and start EC2 instances:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:CreateLogStream",
"logs:PutLogEvents",
"logs:CreateLogGroup",
"ec2:DescribeInstances",
"ec2:StartInstances"
],
"Resource": [
"*"
]
}
]
}

The StopEnvironment function has to be able to describe and stop EC2 instances:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:CreateLogStream",
"logs:PutLogEvents",
"logs:CreateLogGroup",
"ec2:DescribeInstances",
"ec2:StopInstances"
],
"Resource": [
"*"
]
}
]
}

Finally, create an IAM role for each function and attach the above policies:

1
2
3
4
5
6
7
8
9
10
11
12
13
#!/bin/bash

echo "IAM role for StartEnvironment"
arn=$(aws iam create-policy --policy-name StartEnvironment --policy-document file://start/policy.json | jq -r '.Policy.Arn')
result=$(aws iam create-role --role-name StartEnvironmentRole --assume-role-policy-document file://role.json | jq -r '.Role.Arn')
aws iam attach-role-policy --role-name StartEnvironmentRole --policy-arn $arn
echo "ARN: $result"

echo "IAM role for StopEnvironment"
arn=$(aws iam create-policy --policy-name StopEnvironment --policy-document file://stop/policy.json | jq -r '.Policy.Arn')
result=$(aws iam create-role --role-name StopEnvironmentRole --assume-role-policy-document file://role.json | jq -r '.Role.Arn')
aws iam attach-role-policy --role-name StopEnvironmentRole --policy-arn $arn
echo "ARN: $result"

The script will output the ARN for each IAM role:



Before jumping to deployment part, we need to create a Slack WebHook to be able to post messages to Slack channel:



Next, use the following script to deploy your functions to AWS Lambda (make sure to replace the IAM roles, Slack WebHook token & the target environment):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
#!/bin/bash

START_IAM_ROLE="arn:aws:iam::ACCOUNT_ID:role/StartEnvironmentRole"
STOP_IAM_ROLE="arn:aws:iam::ACCOUNT_ID:role/StopEnvironmentRole"
AWS_REGION="us-east-1"
SLACK_WEBHOOK="https://hooks.slack.com/services/TOKEN"
ENVIRONMENT="sandbox"

echo "Deploying StartEnvironment to Lambda"
aws lambda create-function --function-name StartEnvironment \
--zip-file fileb://./start-environment.zip \
--runtime go1.x --handler main \
--role $START_IAM_ROLE \
--environment Variables="{SLACK_WEBHOOK=$SLACK_WEBHOOK,ENVIRONMENT=$ENVIRONMENT}" \
--region $AWS_REGION


echo "Deploying StopEnvironment to Lambda"
aws lambda create-function --function-name StopEnvironment \
--zip-file fileb://./stop-environment.zip \
--runtime go1.x --handler main \
--role $STOP_IAM_ROLE \
--environment Variables="{SLACK_WEBHOOK=$SLACK_WEBHOOK,ENVIRONMENT=$ENVIRONMENT}" \
--region $AWS_REGION \


rm *-environment.zip

Once deployed, if you sign in to AWS Management Console, navigate to Lambda Console, you should see both functions has been deployed successfully:

StartEnvironment:



StopEnvironment:



To further automate the process of invoking the Lambda function at the right time. AWS CloudWatch Scheduled Events will be used.

Create a new CloudWatch rule with the below cron expression (It will be invoked everyday at 9 AM):



And another rule to stop the environment at 6 PM:



Note: All times are GMT time.

Testing:

a – Stop Environment



Result:



b – Start Environment



Result:



The solution is easy to deploy and can help reduce operational costs.

Full code can be found on my GitHub. Make sure to drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.

Build a Serverless Production-Ready Blog

Are you tired of maintaining your CMS (WordPress, Drupal, etc) ? Paying expensive hosting fees? Fixing security issues everyday ?



I discovered not long ago a new blogging framework called Hexo which let you publish Markdown documents in the form of blog post. So as always I got my hands dirty and wrote this post to show you how to build a production-ready blog with Hexo and use the AWS S3 to make your blog Serverless and pay only per usage. Along the way, I will show you how to automate the deployment of new posts by setting up a CI/CD pipeline

To get started, Hexo requires Node.JS & Git to be installed. Once all requirements are installed, issue the following command to install Hexo CLI:

1
npm install -g hexo-cli

Next, create a new empty project:

1
hexo init slowcoder.com

Modify blog global settings in _config.yml file:

1
2
3
4
5
6
7
8
# Site
title: SlowCoder
subtitle: DevOps News and Tutorials
description: DevOps, Cloud, Serverless, Containers news and tutorials for everyone
keywords: programming,devops,cloud,go,mobile,serverless,docker
author: Mohamed Labouardy
language: en
timezone: Europe/Paris

Start a local server with “hexo server“. By default, this is at http://localhost:4000. You’ll see Hexo’s pre-defined “Hello World” test post:



If you want to change the default theme, you just need to go here and find a new one you prefer.

I opt for Magnetic Theme as it includes many features:

  • Disqus and Facebook comments
  • Google Analytics
  • Cover image for posts and pages
  • Tags Support
  • Responsive Images
  • Image Gallery
  • Social Accounts configuration
  • Pagination

Clone the theme GitHub repository as below:

1
git clone https://github.com/klugjo/hexo-theme-magnetic themes/magnetic

Then update your blog’s main _config.yml to set the theme to magnetic. Once done, restart the server:



Now you are almost done with your blog setup. It is time to write your first article. To generate a new article file, use the following command:

1
hexo new POST_TITLE


Now, sign in to AWS Management Console, navigate to S3 Dashboard and create an S3 Bucket or use the AWS CLI to create a new one:

1
aws s3 mb s3://slowcoder.com

Add the following policy to the S3 bucket to make all objects public by default:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
{
"Id": "Policy1522074684919",
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Stmt1522074683215",
"Action": [
"s3:GetObject"
],
"Effect": "Allow",
"Resource": "arn:aws:s3:::slowcoder.com/*",
"Principal": "*"
}
]
}

Next, enable static website hosting on the S3 bucket:

1
aws s3 website s3://slowcoder.com --index-document index.html

In order to automate the process of deployment of the blog to S3 each time a new article is been published. We will setup a CI/CD pipeline using CircleCI.

Sign in to CircleCI using your GitHub account, then add the circle.yml file to your project:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
version: 2
jobs:
build:
docker:
- image: node:9.9.0
working_directory: ~/slowcoder.com
steps:
- checkout
- run:
name: Install Hexo CLI
command: npm install -g hexo-cli
- restore_cache:
keys:
- npm-deps-{{ checksum "package.json" }}
- run:
name: Install dependencies
command: npm install
- save_cache:
key: npm-deps-{{ checksum "package.json" }}
paths:
- node_modules
- run:
name: Generate static website
command: hexo generate
- run:
name: Install AWS CLI
command: |
apt-get update
apt-get install -y awscli
- run:
name: Push to S3 bucket
command: cd public/ && aws s3 sync . s3://slowcoder.com

Note: Make sure to set the AWS Access Key ID and Secret Access Key in your Project’s Settings page on CircleCI (s3:PutObject permission).

Now every time you push changes to your GitHub repo, CircleCI will automatically deploy the changes to S3. Here’s a passing build:



Finally, to make our blog user-friendly, we will setup a custom domain name in Route53 as below:



Note: You can go further and setup a CloudFront Distribution in front of the S3 bucket to optimize delivery of blog assets.

You can test your brand new blog now by typing the following adress: http://slowcoder.com :



Drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.

Add new users to EC2 and give SSH Key access

In this quick post, I will show you how to add a new user to an EC2 instance and SSH with your own private key rather than having to authenticate using the private key generated by AWS.



Connect via SSH into your instance using its public IP:



Next, create a new user using the following command:

1
sudo adduser labouardy


Next, we switch the shell session to the new account:

1
sudo su labouardy

Create .ssh directory, and change the directory permission to 700 (only the file owner can read, write or open the directory):

1
2
mkdir .ssh
chmod 700 .ssh

Note: ensure you are in the new user’s home directory (example: /home/labouardy)

Create an empty file called authorized_keys in the .ssh directory and change its permissions to 600 (only the file owner can read or writ eto the file)

1
2
touch authorized_keys
chmod 600 authorized_keys


Finally, edit the authorized_keys file and past in your public key:



Once you’ve done this, exist out back to your machine, then try to SSH using the the new credential and user account you’ve created:



We now are logged in as user labouardy 😄

Drop your comments, feedback, or suggestions below — or connect with me directly on Twitter @mlabouardy.

Your browser is out-of-date!

Update your browser to view this website correctly. Update my browser now

×