Install RabbitMQ and Minimal Erlang on Amazon Linux

The RabbitMQ website provides instructions on how to install the service on CentOS and Ubuntu Elastic Compute Cloud (EC2) instances. While the Amazon Linux distro uses CentOS as a base, it differs enough to make installing RabbitMQ tricky for system admins. I have identified and addressed the challenges here, and provide instructions on how to install RabbitMQ on Amazon Linux without difficulty.

  1. Determine the init system
  2. Set up a simple RPM build environment
  3. Build and install the minimal Erlang runtime
  4. Install and configure RabbitMQ
  5. Create and deploy a RabbitMQ Security Group

1. Determine the init system

I can boil all of the confusion down to one fact: CentOS changed its init system between CentOS 6 and CentOS 7. If you are not a rabid CentOS follower, you would not know this, and would not realize that this one change is the root cause of the installation pain. Amazon Linux currently derives from CentOS 6, and therefore uses the original sysvinit system. The current CentOS 7 runs systemd. You do not need to know the difference between the two, but rather, which one Amazon Linux supports.

Run the following command.

[ec2-user@ip-172-31-4-69 ~]$ if (pidof /sbin/init) ; then echo "sysvinit"; elif (pidof systemd); then echo "systemd"; fi | sed -n '1!p'
sysvinit
[ec2-user@ip-172-31-4-69 ~]$

 

As of May 2017, Amazon Linux uses sysvinit. In order to accommodate sysvinit, you need to download RPMs made for CentOS 6 (i.e. RPMs that include el6 in the name).

2. Set up a simple RPM build environment

First, install the tools you need to build an RPM.

[ec2-user@ip-172-31-4-69 ~]$ sudo yum -y install rpm-build redhat-rpm-config
Loaded plugins: priorities, update-motd, upgrade-helper
amzn-main                                                                              | 2.1 kB  00:00:00
amzn-updates                                                                           | 2.3 kB  00:00:00
Resolving Dependencies

...

Installed:
  rpm-build.x86_64 0:4.11.3-21.75.amzn1              system-rpm-config.noarch 0:9.0.3-42.28.amzn1

Dependency Installed:
  elfutils.x86_64 0:0.163-3.18.amzn1 elfutils-libs.x86_64 0:0.163-3.18.amzn1   gdb.x86_64 0:7.6.1-64.33.amzn1
  patch.x86_64 0:2.7.1-8.9.amzn1     perl-Thread-Queue.noarch 0:3.02-2.5.amzn1

Complete!
[ec2-user@ip-172-31-4-69 ~]$

 

Now, create the build environment. Here, you create the subdirectories that an RPM build environment needs. For details, see https://wiki.centos.org/HowTos/SetupRpmBuildEnvironment

[ec2-user@ip-172-31-4-69 ~]$ cd
[ec2-user@ip-172-31-4-69 ~]$ mkdir -p ~/rpmbuild/{BUILD,RPMS,SOURCES,SPECS,SRPMS}
[ec2-user@ip-172-31-4-69 ~]$ echo '%_topdir %(echo $HOME)/rpmbuild' > ~/.rpmmacros
[ec2-user@ip-172-31-4-69 ~]$ cat .rpmmacros
%_topdir %(echo $HOME)/rpmbuild
[ec2-user@ip-172-31-4-69 ~]$ ls rpmbuild/
BUILD  RPMS  SOURCES  SPECS  SRPMS
[ec2-user@ip-172-31-4-69 ~]$

 

Now install the development tools.

[ec2-user@ip-172-31-4-69 ~]$ sudo yum -y install autoconf gcc git ncurses-devel openssl-devel
Loaded plugins: priorities, update-motd, upgrade-helper
amzn-main                                                                              | 2.1 kB  00:00:00
amzn-updates                                                                           | 2.3 kB  00:00:00
Resolving Dependencies
--> Running transaction check

...


Installed:
  autoconf.noarch 0:2.69-11.9.amzn1                   gcc.noarch 0:4.8.3-3.20.amzn1
  git.x86_64 0:2.7.4-1.47.amzn1                       ncurses-devel.x86_64 0:5.7-4.20090207.14.amzn1
  openssl-devel.x86_64 1:1.0.1k-15.99.amzn1


Dependency Installed:
  cpp48.x86_64 0:4.8.3-9.111.amzn1                       gcc48.x86_64 0:4.8.3-9.111.amzn1
  glibc-devel.x86_64 0:2.17-157.169.amzn1                glibc-headers.x86_64 0:2.17-157.169.amzn1
  kernel-headers.x86_64 0:4.9.27-14.31.amzn1             keyutils-libs-devel.x86_64 0:1.5.8-3.12.amzn1
  krb5-devel.x86_64 0:1.14.1-27.41.amzn1                 libcom_err-devel.x86_64 0:1.42.12-4.40.amzn1
  libkadm5.x86_64 0:1.14.1-27.41.amzn1                   libselinux-devel.x86_64 0:2.1.10-3.22.amzn1
  libsepol-devel.x86_64 0:2.1.7-3.12.amzn1               libgomp.x86_64 0:4.8.3-9.111.amzn1
  libmpc.x86_64 0:1.0.1-3.3.amzn1                        libverto-devel.x86_64 0:0.2.5-4.9.amzn1
  m4.x86_64 0:1.4.16-9.10.amzn1                          mpfr.x86_64 0:3.1.1-4.14.amzn1
  perl-Data-Dumper.x86_64 0:2.145-3.5.amzn1              perl-Error.noarch 1:0.17020-2.9.amzn1
  perl-Git.noarch 0:2.7.4-1.47.amzn1                     perl-TermReadKey.x86_64 0:2.30-20.9.amzn1
  zlib-devel.x86_64 0:1.2.8-7.18.amzn1 
  
  
Complete!
[ec2-user@ip-172-31-4-69 ~]$

 

Pull the source code for minimal Erlang from git.

[ec2-user@ip-172-31-4-69 ~]$ git clone https://github.com/rabbitmq/erlang-rpm.git
Cloning into 'erlang-rpm'...
remote: Counting objects: 258, done.
remote: Total 258 (delta 0), reused 0 (delta 0), pack-reused 258
Receiving objects: 100% (258/258), 55.33 KiB | 0 bytes/s, done.
Resolving deltas: 100% (147/147), done.
Checking connectivity... done.
[ec2-user@ip-172-31-4-69 ~]$

 

3. Build and install the minimal Erlang runtime

Change directories to erlang-rpm to start the build.

[ec2-user@ip-172-31-4-69 ~]$ cd erlang-rpm/
[ec2-user@ip-172-31-4-69 erlang-rpm]$

 

Execute make to build the RPM. If you encounter any errors, 99.99% of the time the error will be due to a missing package. Simply read the error message to identify the missing package, install that package and execute make once more.

[ec2-user@ip-172-31-4-69 erlang-rpm]$ make
rm -rf BUILDROOT BUILD SOURCES SPECS SRPMS RPMS tmp FINAL_RPMS dist
mkdir -p BUILD SOURCES SPECS SRPMS RPMS tmp dist
wget -O dist/OTP-19.3.4.tar.gz https://github.com/erlang/otp/archive/OTP-19.3.4.tar.gz#
--2017-05-26 17:30:16--  https://github.com/erlang/otp/archive/OTP-19.3.4.tar.gz
Resolving github.com (github.com)... 192.30.253.113, 192.30.253.112
Connecting to github.com (github.com)|192.30.253.113|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/erlang/otp/tar.gz/OTP-19.3.4 [following]
--2017-05-26 17:30:16--  https://codeload.github.com/erlang/otp/tar.gz/OTP-19.3.4
Resolving codeload.github.com (codeload.github.com)... 192.30.253.120, 192.30.253.121
Connecting to codeload.github.com (codeload.github.com)|192.30.253.120|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-gzip]
Saving to: ‘dist/OTP-19.3.4.tar.gz’

dist/OTP-19.3.4.tar.gz          [                <=>                       ]  32.42M  7.73MB/s    in 4.2s

...

 

For example, the first time I tried to build the erlang-rpm, I got the following error about not finding crypto libraries.

RPM build errors:
    bogus date in %changelog: Thu Oct 13 2015 Michael Klishin <michael@rabbitmq.com> - 18.1
    Directory not found by glob: /home/ec2-user/erlang-rpm/BUILDROOT/erlang-19.3.4-1.amzn1.x86_64/usr/lib64/erlang/lib/crypto-*/
    Directory not found by glob: /home/ec2-user/erlang-rpm/BUILDROOT/erlang-19.3.4-1.amzn1.x86_64/usr/lib64/erlang/lib/ssl-*/
    File not found by glob: /home/ec2-user/erlang-rpm/BUILDROOT/erlang-19.3.4-1.amzn1.x86_64/usr/lib64/erlang/lib/ssl-*/ebin
    File not found by glob: /home/ec2-user/erlang-rpm/BUILDROOT/erlang-19.3.4-1.amzn1.x86_64/usr/lib64/erlang/lib/ssl-*/src
make: *** [erlang] Error 1

 

A quick Google search for “rpm build errors file not found buildroot crypto” leads to the solution: it turns out that during my first attempt, I neglected to install openssl-devel. To fix the error, I installed openssl-devel.

[ec2-user@ip-172-31-4-69 erlang-rpm]$ sudo yum -y install openssl-devel
Loaded plugins: priorities, update-motd, upgrade-helper
amzn-main                                                                              | 2.1 kB  00:00:00
amzn-updates                                                                           | 2.3 kB  00:00:00
Resolving Dependencies
--> Running transaction check

...


Installed:
  openssl-devel.x86_64 1:1.0.1k-15.99.amzn1

Dependency Installed:
  keyutils-libs-devel.x86_64 0:1.5.8-3.12.amzn1            krb5-devel.x86_64 0:1.14.1-27.41.amzn1
  libcom_err-devel.x86_64 0:1.42.12-4.40.amzn1             libkadm5.x86_64 0:1.14.1-27.41.amzn1
  libselinux-devel.x86_64 0:2.1.10-3.22.amzn1              libsepol-devel.x86_64 0:2.1.7-3.12.amzn1
  libverto-devel.x86_64 0:0.2.5-4.9.amzn1                  zlib-devel.x86_64 0:1.2.8-7.18.amzn1

Complete!
[ec2-user@ip-172-31-4-69 erlang-rpm]$

 

…and run make again (from the erlang-rpm directory).

After a while the build will succeed, and you will see output like the following.

Wrote: /home/ec2-user/erlang-rpm/RPMS/x86_64/erlang-19.3.4-1.amzn1.x86_64.rpm
Wrote: /home/ec2-user/erlang-rpm/RPMS/x86_64/erlang-debuginfo-19.3.4-1.amzn1.x86_64.rpm
Executing(%clean): /bin/sh -e /home/ec2-user/erlang-rpm/tmp/rpm-tmp.ekgXf8
+ umask 022
+ cd /home/ec2-user/erlang-rpm/BUILD
+ cd otp-OTP-19.3.4
+ rm -rf /home/ec2-user/erlang-rpm/BUILDROOT/erlang-19.3.4-1.amzn1.x86_64
+ exit 0
find RPMS -name "*.rpm" -exec sh -c 'mv {} `echo {} | sed 's#^RPMS\/noarch#FINAL_RPMS#'`' ';'
mv: ‘RPMS/x86_64/erlang-debuginfo-19.3.4-1.amzn1.x86_64.rpm’ and ‘RPMS/x86_64/erlang-debuginfo-19.3.4-1.amzn1.x86_64.rpm’ are the same file
mv: ‘RPMS/x86_64/erlang-19.3.4-1.amzn1.x86_64.rpm’ and ‘RPMS/x86_64/erlang-19.3.4-1.amzn1.x86_64.rpm’ are the same file

 

Before you install Erlang, delete any old versions.

[ec2-user@ip-172-31-4-69 erlang-rpm]$ sudo yum -y remove erlang-*
Loaded plugins: priorities, update-motd, upgrade-helper
No Match for argument: erlang-*
No Packages marked for removal
[ec2-user@ip-172-31-4-69 erlang-rpm]$

 

Now, install the Erlang RPM you just built. You will find it in the RPMS/x86_64/ directory. It will most likely have a different name than the one I use below. Either way, notice that the RPM includes amzn1 in its filename.

[ec2-user@ip-172-31-4-69 erlang-rpm]$ sudo yum -y install RPMS/x86_64/erlang-19.3.4-1.amzn1.x86_64.rpm
Loaded plugins: priorities, update-motd, upgrade-helper
Examining RPMS/x86_64/erlang-19.3.4-1.amzn1.x86_64.rpm: erlang-19.3.4-1.amzn1.x86_64
Marking RPMS/x86_64/erlang-19.3.4-1.amzn1.x86_64.rpm to be installed
Resolving Dependencies

...

Running transaction
  Installing : erlang-19.3.4-1.amzn1.x86_64                                                               1/1
  Verifying  : erlang-19.3.4-1.amzn1.x86_64                                                               1/1

Installed:
  erlang.x86_64 0:19.3.4-1.amzn1

Complete!
[ec2-user@ip-172-31-4-69 erlang-rpm]$

 

4. Install and configure RabbitMQ

You can follow the instructions on the RabbitMQ web site to install the service. Remember, in step one we discovered that the current version of Amazon Linux uses sysvinit. We therefore need to download the CentOS 6 / EL6 RPM.

 

If you run sysvinit, then download the RabbitMQ RPM with el6 in the name. If you run systemd, download the RabbitMQ RPM with el7 in the name.

 

Change directories and then wget the RPM. You may have a different URL than the one in this blog post; go to https://www.rabbitmq.com/install-rpm.html to fetch the most recent RPM URL.


[ec2-user@ip-172-31-4-69 erlang-rpm]$ cd
[ec2-user@ip-172-31-4-69 ~]$ wget https://www.rabbitmq.com/releases/rabbitmq-server/v3.6.10/rabbitmq-server-3.6.10-1.el6.noarch.rpm
--2017-05-26 18:21:28--  https://www.rabbitmq.com/releases/rabbitmq-server/v3.6.10/rabbitmq-server-3.6.10-1.el6.noarch.rpm
Resolving www.rabbitmq.com (www.rabbitmq.com)... 192.240.153.117
Connecting to www.rabbitmq.com (www.rabbitmq.com)|192.240.153.117|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4931483 (4.7M) [application/x-redhat-package-manager]
Saving to: ‘rabbitmq-server-3.6.10-1.el6.noarch.rpm’

rabbitmq-server-3.6.10-1.el 100%[=========================================>]   4.70M  3.58MB/s    in 1.3s

2017-05-26 18:21:30 (3.58 MB/s) - ‘rabbitmq-server-3.6.10-1.el6.noarch.rpm’ saved [4931483/4931483]

[ec2-user@ip-172-31-4-69 ~]$

 

Now install the signing key. Go to https://www.rabbitmq.com/install-rpm.html to ensure you use the most recent URL.


[ec2-user@ip-172-31-4-69 ~]$ sudo rpm --import https://www.rabbitmq.com/rabbitmq-release-signing-key.asc
[ec2-user@ip-172-31-4-69 ~]$

 

Now install the RPM you just downloaded.

 

[ec2-user@ip-172-31-4-69 ~]$ sudo yum -y install rabbitmq-server-3.6.10-1.el6.noarch.rpm
Loaded plugins: priorities, update-motd, upgrade-helper
Examining rabbitmq-server-3.6.10-1.el6.noarch.rpm: rabbitmq-server-3.6.10-1.el6.noarch
Marking rabbitmq-server-3.6.10-1.el6.noarch.rpm to be installed
Resolving Dependencies
amzn-main/latest                                                                       | 2.1 kB  00:00:00
amzn-updates/latest                                                                    | 2.3 kB  00:00:00

...

Installed:
  rabbitmq-server.noarch 0:3.6.10-1.el6

Dependency Installed:
  compat-readline5.x86_64 0:5.2-17.3.amzn1                  socat.x86_64 0:1.7.2.3-1.10.amzn1

Complete!

 

Use chkconfig to start RabbitMQ on system boot. Then, use the service command to start the service. Since Amazon Linux runs sysvinit, we use the “chkconfig” and “service” commands. For systemd operating systems, we would use “systemctl,” as shown in the note after the transcript below.

 

[ec2-user@ip-172-31-4-69 ~]$ sudo chkconfig rabbitmq-server on
[ec2-user@ip-172-31-4-69 ~]$ sudo service rabbitmq-server start
Starting rabbitmq-server: SUCCESS
rabbitmq-server.
[ec2-user@ip-172-31-4-69 ~]$
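For reference, on a systemd based distribution (e.g. CentOS 7), the equivalent commands would be the following two. You do not need them on Amazon Linux.

$ sudo systemctl enable rabbitmq-server
$ sudo systemctl start rabbitmq-server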

 

Once we have RabbitMQ up and running, we can configure it as needed:

 

[ec2-user@ip-172-31-4-69 ~]$ sudo rabbitmqctl add_user myserver myserver123
Creating user "myserver"
[ec2-user@ip-172-31-4-69 ~]$ sudo rabbitmqctl add_vhost myserver_vhost
Creating vhost "myserver_vhost"
[ec2-user@ip-172-31-4-69 ~]$ sudo rabbitmqctl set_user_tags myserver myserver_tag
Setting tags for user "myserver" to [myserver_tag]
[ec2-user@ip-172-31-4-69 ~]$ sudo rabbitmqctl set_user_tags myserver monitoring
Setting tags for user "myserver" to [monitoring]
[ec2-user@ip-172-31-4-69 ~]$ sudo rabbitmqctl set_permissions -p myserver_vhost myserver ".*" ".*" ".*"
Setting permissions for user "myserver" in vhost "myserver_vhost"
[ec2-user@ip-172-31-4-69 ~]$ sudo rabbitmq-plugins enable rabbitmq_management
The following plugins have been enabled:
  amqp_client
  cowlib
  cowboy
  rabbitmq_web_dispatch
  rabbitmq_management_agent
  rabbitmq_management

Applying plugin configuration to rabbit@ip-172-31-4-69... started 6 plugins.
[ec2-user@ip-172-31-4-69 ~]$ sudo service rabbitmq-server restart
Restarting rabbitmq-server: SUCCESS
rabbitmq-server.
[ec2-user@ip-172-31-4-69 ~]$

 

5. Create and deploy a RabbitMQ Security Group

To use the service, punch a hole in the EC2 firewall via a custom security group.

First, on the AWS GUI, select EC2 under compute.


Next, select Security Groups under NETWORK & SECURITY.

 

Click Create Security Group.


Edit the name to read rabbit_mq, set the TCP port range to 5672 and set the source network that can access your new RabbitMQ service. In my case, I set it to the address range of my RabbitMQ server’s Local Area Network (LAN).


In the EC2 console, click your rabbit_mq server, click Actions, click Networking and then Change Security Groups.


Attach the rabbit_mq security group.  If you don’t see the security group, ensure you configured the correct VPC when you created the security group.
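If you prefer to script this step, the following boto3 sketch mirrors the console actions above. The VPC ID and CIDR range are hypothetical placeholders; substitute your own values.

#!/usr/bin/env python
# create_sg.py: a sketch that mirrors the console steps above with boto3.
# The VpcId and CidrIp below are hypothetical; substitute your own values.
import boto3

ec2 = boto3.client('ec2')

# Create the rabbit_mq security group in your VPC
sg = ec2.create_security_group(
    GroupName='rabbit_mq',
    Description='Allow AMQP traffic to RabbitMQ',
    VpcId='vpc-0123456789abcdef0')

# Punch a hole for TCP 5672 (AMQP) from your server's LAN
ec2.authorize_security_group_ingress(
    GroupId=sg['GroupId'],
    IpPermissions=[{
        'IpProtocol': 'tcp',
        'FromPort': 5672,
        'ToPort': 5672,
        'IpRanges': [{'CidrIp': '172.31.0.0/16'}]
    }])

print('Created security group %s' % sg['GroupId'])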

 

You now have a dedicated RabbitMQ service and are ready to try a simple “hello world” program.
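The “hello world” program itself is beyond the scope of this post, but a minimal sketch with the pika AMQP library (my choice for illustration; any AMQP client works) would connect with the myserver credentials and vhost created above. The host address is a placeholder.

#!/usr/bin/env python
# hello_rabbit.py: a minimal sketch (assumes pip install pika).
# The host below is a placeholder; use your RabbitMQ server's address.
import pika

credentials = pika.PlainCredentials('myserver', 'myserver123')
parameters = pika.ConnectionParameters(host='172.31.4.69',
                                       port=5672,
                                       virtual_host='myserver_vhost',
                                       credentials=credentials)

connection = pika.BlockingConnection(parameters)
channel = connection.channel()

# Declare a queue and publish one message to the default exchange
channel.queue_declare(queue='hello')
channel.basic_publish(exchange='', routing_key='hello', body='Hello world!')
print("Sent 'Hello world!'")
connection.close()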

Connect AWS Lambda to Elasticsearch

Amazon Web Services’ (AWS) Lambda provides a serverless architecture framework for your web applications.  You deploy your application to Lambda, attach an API Gateway and then call your new service from anywhere on the web.  Amazon takes care of all the tedious, boring and necessary housekeeping.

In this HOWTO I show you how to create a proxy in front of the AWS Elasticsearch service using a Lambda function and an API Gateway.  We use Identity and Access Management  (IAM) policies to sign and encrypt the communication between your Lambda function and  the Elasticsearch service.  This HOWTO serves as a simple starting point.  Once you successfully jump through the hoops to connect Lambda to Elasticsearch, you can easily grow your application to accommodate new features and services.

The agenda for this HOWTO follows:

  1. Deploy and configure an AWS Elasticsearch endpoint
  2. Configure your Chalice development environment
  3. Create an app that proxies/protects your Elasticsearch endpoint
  4. Configure an IAM policy for your Lambda function
  5. Use Chalice to deploy your Lambda function and create/attach an API gateway
  6. Test drive your new Lambda function

1. Deploy and configure an AWS Elasticsearch endpoint

Amazon makes Elasticsearch deployment a snap.  Just click the Elasticsearch Service icon on your management screen.


If you see the “Get Started” screen, click “Get Started.”


Or, if you’ve used the Elasticsearch service before and see the option for “New Domain,” click “New Domain.”


Name your domain “test-domain” (Or whatever).


Keep the defaults on the next screen, “Step 2: Configure Cluster,” and just click “Next.”  On the next screen, select “Allow or deny access to one or more AWS accounts or IAM users.”


Amazon makes security easy as well.  On the next menu they list your ARN.  Just copy and paste it into the text field and hit “next.”


AWS generates the JSON for your Elasticsearch service.


Click “Next” and then “confirm and create.”

Expect the service to take about ten (10) minutes to initialize.  While you wait for the service to deploy, you should set up your Chalice development environment.

 

2. Configure your Chalice development environment

 

As a convenience, I summarize the instructions from the authoritative Chalice HOWTO here.

First, create a Python virtual environment for development.

 

[ec2-user@ip-172-31-4-69 ~]$ virtualenv chalice-demo
New python executable in chalice-demo/bin/python2.7
Also creating executable in chalice-demo/bin/python
Installing setuptools, pip...done.

 

Change directories to your new sandbox and then activate the virtual environment.

 

[ec2-user@ip-172-31-4-69 ~]$ cd chalice-demo/
[ec2-user@ip-172-31-4-69 chalice-demo]$ . bin/activate

 

Now upgrade pip.

 

(chalice-demo)[ec2-user@ip-172-31-4-69 chalice-demo]$ pip install -U pip
You are using pip version 6.0.8, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Collecting pip from https://pypi.python.org/packages/b6/ac/7015eb97dc749283ffdec1c3a88ddb8ae03b8fad0f0e611408f196358da3/pip-9.0.1-py2.py3-none-any.whl#md5=297dbd16ef53bcef0447d245815f5144
  Using cached pip-9.0.1-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 6.0.8
    Uninstalling pip-6.0.8:
      Successfully uninstalled pip-6.0.8

Successfully installed pip-9.0.1
(chalice-demo)[ec2-user@ip-172-31-4-69 chalice-demo]$

 

Finally, install Chalice.

 

(chalice-demo)[ec2-user@ip-172-31-4-69 chalice-demo]$ pip install chalice
Collecting chalice
  Downloading chalice-0.8.0.tar.gz (86kB)
    100% |████████████████████████████████| 92kB 6.6MB/s 
Collecting click==6.6 (from chalice)
  Downloading click-6.6-py2.py3-none-any.whl (71kB)
    100% |████████████████████████████████| 71kB 6.9MB/s 
Collecting botocore<2.0.0,>=1.5.0 (from chalice)
  Downloading botocore-1.5.45-py2.py3-none-any.whl (3.4MB)
    100% |████████████████████████████████| 3.5MB 335kB/s 
Collecting virtualenv<16.0.0,>=15.0.0 (from chalice)
  Downloading virtualenv-15.1.0-py2.py3-none-any.whl (1.8MB)
    100% |████████████████████████████████| 1.8MB 648kB/s 
Collecting typing==3.5.3.0 (from chalice)
  Downloading typing-3.5.3.0.tar.gz (60kB)
    100% |████████████████████████████████| 61kB 9.3MB/s 
Collecting six<2.0.0,>=1.10.0 (from chalice)
  Downloading six-1.10.0-py2.py3-none-any.whl
Collecting jmespath<1.0.0,>=0.7.1 (from botocore<2.0.0,>=1.5.0->chalice)
  Downloading jmespath-0.9.2-py2.py3-none-any.whl
Collecting docutils>=0.10 (from botocore<2.0.0,>=1.5.0->chalice)
  Downloading docutils-0.13.1-py2-none-any.whl (537kB)
    100% |████████████████████████████████| 542kB 2.2MB/s 
Collecting python-dateutil<3.0.0,>=2.1 (from botocore<2.0.0,>=1.5.0->chalice)
  Downloading python_dateutil-2.6.0-py2.py3-none-any.whl (194kB)
    100% |████████████████████████████████| 194kB 5.7MB/s 
Installing collected packages: click, jmespath, docutils, six, python-dateutil, botocore, virtualenv, typing, chalice
  Running setup.py install for typing ... done
  Running setup.py install for chalice ... done
Successfully installed botocore-1.5.45 chalice-0.8.0 click-6.6 docutils-0.13.1 jmespath-0.9.2 python-dateutil-2.6.0 six-1.10.0 typing-3.5.3.0 virtualenv-15.1.0
(chalice-demo)[ec2-user@ip-172-31-4-69 chalice-demo]$ 

 

The quickstart is pretty clear about how to configure credentials.  Here are their instructions verbatim…

Before you can deploy an application, be sure you have credentials configured. If you have previously configured your machine to run boto3 (the AWS SDK for Python) or the AWS CLI then you can skip this section.

If this is your first time configuring credentials for AWS you can follow these steps to quickly get started:

$ mkdir ~/.aws
$ cat >> ~/.aws/config
[default]
aws_access_key_id=YOUR_ACCESS_KEY_HERE
aws_secret_access_key=YOUR_SECRET_ACCESS_KEY
region=YOUR_REGION (such as us-west-2, us-west-1, etc)

If you want more information on all the supported methods for configuring credentials, see the boto3 docs.

 

From the chalice-demo directory, create a new Chalice project.

 

(chalice-demo)[ec2-user@ip-172-31-4-69 chalice-demo]$ chalice new-project eslambda

 

You have set up your development environment.

 

3. Create an app that proxies/protects your Elasticsearch endpoint

 

At this point, your Elasticsearch endpoint should be up and running.  Copy the fully qualified domain name (FQDN) for your new endpoint.  You will copy this FQDN into the application below.


The following application uses the boto library to access an authorized IAM role to sign and encrypt calls to  your Elasticsearch endpoint.  Be sure to configure the host parameter with your Endpoint address.
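The original post embeds the full app.py here. A minimal sketch of such a proxy follows; it assumes boto’s AWSAuthConnection with SigV4 (hmac-v4) signing, and the host value is a placeholder for your own endpoint FQDN.

#!/usr/bin/env python
# app.py: a sketch of a Chalice proxy that signs requests to the
# Elasticsearch endpoint with boto's SigV4 (hmac-v4) support.
# The host below is a placeholder; paste in your own endpoint FQDN.
from boto.connection import AWSAuthConnection
from chalice import Chalice

app = Chalice(app_name='eslambda')


class ESConnection(AWSAuthConnection):
    """Signs requests to the AWS Elasticsearch service."""

    def __init__(self, region, **kwargs):
        super(ESConnection, self).__init__(**kwargs)
        self._set_auth_region_name(region)
        self._set_auth_service_name('es')

    def _required_auth_capability(self):
        return ['hmac-v4']


@app.route('/')
def index():
    client = ESConnection(
        region='us-east-1',
        host='search-test-domain-abc123.us-east-1.es.amazonaws.com')
    resp = client.make_request(method='GET', path='/')
    return resp.read()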


Change directories to the new eslambda project.  You will see two automatically created files: app.py and requirements.txt

 

(chalice-demo)[ec2-user@ip-172-31-4-69 chalice-demo]$ cd eslambda/
(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$ ls
app.py  requirements.txt
(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$

 

Overwrite app.py with the app.py code above.  Then, pip install boto.  Use the pip freeze | grep boto command to populate requirements.txt with the proper version of boto.  requirements.txt tells Lambda which Python packages to install.

 

(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$ pip install boto
Collecting boto
  Downloading boto-2.46.1-py2.py3-none-any.whl (1.4MB)
    100% |████████████████████████████████| 1.4MB 851kB/s 
Installing collected packages: boto
Successfully installed boto-2.46.1
(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$ pip freeze | grep boto >> requirements.txt 
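If you cat requirements.txt, you will see the pinned boto version (2.46.1 at the time of this capture).

(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$ cat requirements.txt
boto==2.46.1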

4. Configure an IAM policy for your Lambda function

 

Create a document called policy.json in the hidden .chalice directory and add the following JSON. This will let Lambda use the Elasticsearch service.

 

(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$ vim .chalice/policy.json
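The JSON itself did not survive in this capture; a reasonable version, granting the Lambda function full access to the Elasticsearch service, follows. In production, scope the Resource down to your domain’s ARN.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "es:*"
      ],
      "Resource": "arn:aws:es:*:*:*",
      "Effect": "Allow"
    }
  ]
}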


5. Use Chalice to deploy your Lambda function and create/attach an API gateway

 

Cross your fingers, this should work.  Deploy your Chalice application with the following command.  Take note of the endpoint that Chalice returns.

 

(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$ chalice deploy
Initial creation of lambda function.
Creating role
Creating deployment package.
Initiating first time deployment...
Deploying to: dev
https://keqpeva3wi.execute-api.us-east-1.amazonaws.com/dev/
(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$ 

6. Test drive your new Lambda function

 

Enter the URL of the service endpoint in your browser.  In my case, I will go to https://keqpeva3wi.execute-api.us-east-1.amazonaws.com/dev/


Did you get an authorization error?  Yes, so did I.  For some reason the steps in the Chalice quick start do not seem to work.  If you take a look at policy.json you’ll see that Chalice overwrote it.

 

(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$ cat .chalice/policy.json 
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*",
      "Effect": "Allow"
    }
  ]
}(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$

 

Chalice created a policy to allow our Lambda function to log.  Let’s keep that action and add the Elasticsearch verbs.  Edit .chalice/policy.json once more, this time using the enriched JSON encoded policy below.
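The enriched policy did not survive in this capture either; a version consistent with the narrative keeps Chalice’s logging statement and adds the Elasticsearch actions.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*",
      "Effect": "Allow"
    },
    {
      "Action": [
        "es:*"
      ],
      "Resource": "arn:aws:es:*:*:*",
      "Effect": "Allow"
    }
  ]
}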


Redeploy, this time turning off the auto policy generation.

 

(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$ chalice deploy --no-autogen-policy
Updating IAM policy.
Updating lambda function...
Regen deployment package...
Sending changes to lambda.
API Gateway rest API already found.
Deploying to: dev
https://keqpeva3wi.execute-api.us-east-1.amazonaws.com/dev/
(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$

 

It may take a few minutes for the new Lambda function to bake in.  Be sure to hit Control+F5 to make sure you’re not hitting a cached version of your new application.  Alternatively, you can pip install httpie.

From the command line, use httpie to access your new proxy.
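For example (a hypothetical invocation; the response body will vary with your cluster):

(chalice-demo)[ec2-user@ip-172-31-4-69 eslambda]$ http https://keqpeva3wi.execute-api.us-east-1.amazonaws.com/dev/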


Congratulations!  Your Lambda function can hit your Elasticsearch service!

Add @Timestamp to your Python Elasticsearch DSL Model

The Python Elasticsearch Domain Specific Language (DSL) lets you create models via Python objects.

Take a look at the model Elastic creates in their persistence example.

 

#!/usr/bin/env python
# persist.py
from datetime import datetime
from elasticsearch_dsl import DocType, Date, Integer, Keyword, Text
from elasticsearch_dsl.connections import connections

class Article(DocType):
    title = Text(analyzer='snowball', fields={'raw': Keyword()})
    body = Text(analyzer='snowball')
    tags = Keyword()
    published_from = Date()
    lines = Integer()

    class Meta:
        index = 'blog'

    def save(self, **kwargs):
        self.lines = len(self.body.split())
        return super(Article, self).save(**kwargs)

    def is_published(self):
        return datetime.now() < self.published_from

if __name__ == "__main__":
    connections.create_connection(hosts=['localhost'])
    # create the mappings in elasticsearch
    Article.init()

 

I wrapped their example in a script and named it persist.py.  To initialize the model (create the index and mappings), execute persist.py from the command line.

 

$ chmod +x persist.py
$ ./persist.py

 

We can take a look at these mappings via the _mapping API. In the model, Elastic names the index blog. Use blog, therefore, when you send the request to the API.

 

$ curl -XGET 'http://localhost:9200/blog/_mapping?pretty'

 

Executing persist.py, which calls Article.init(), generated the following automatic mapping (schema).

 

{
  "blog" : {
    "mappings" : {
      "article" : {
        "properties" : {
          "body" : {
            "type" : "text",
            "analyzer" : "snowball"
          },
          "lines" : {
            "type" : "integer"
          },
          "published_from" : {
            "type" : "date"
          },
          "tags" : {
            "type" : "keyword"
          },
          "title" : {
            "type" : "text",
            "fields" : {
              "raw" : {
                "type" : "keyword"
              }
            },
            "analyzer" : "snowball"
          }
        }
      }
    }
  }
}

 

That’s pretty neat! The DSL creates the mapping (schema) for you, with the right Types. Now that we have the model and mapping in place, use the Elastic provided example to create a document.

 

#!/usr/bin/env python

# create_doc.py
from datetime import datetime
from persist import Article
from elasticsearch_dsl.connections import connections

# Define a default Elasticsearch client
connections.create_connection(hosts=['localhost'])

# create and save and article
article = Article(meta={'id': 42}, title='Hello world!', tags=['test'])
article.body = ''' looong text '''
article.published_from = datetime.now()
article.save()

 

Again, I wrapped their code in a script.  Run the script.

 

$ chmod +x create_doc.py
$ ./create_doc.py

 

If you look at the mapping, you see the published_from field maps to a Date type. To see this in Kibana, go to Management –> Index Patterns.


Now type blog (the name of the index from the model) into the Index Name or Pattern box.


From here, you can select published_from as the time-field name.


If you go to Discover, you will see your blog post.


Logstash, however, uses @timestamp for the time-field name. It would be nice to use the standard name instead of a one-off, custom name. To use @timestamp, we must first update the model.

In persist.py (above), change the save stanza from…

 

    def save(self, **kwargs):
        self.lines = len(self.body.split())
        return super(Article, self).save(**kwargs)

 

to…

 

    def save(self, **kwargs):
        self.lines = len(self.body.split())
        self['@timestamp'] = datetime.now()
        return super(Article, self).save(**kwargs)

 

It took me a ton of trial and error to finally realize we need to update @timestamp as a dictionary key. I just shared the special sauce recipe with you, so, you’re welcome! Once you update the model, run create_doc.py (above) again.

 

$ ./create_doc.py

 

Then, go back to Kibana –> Management –> Index Patterns and delete the old blog pattern.


When you re-create the index pattern, you will now have a pull down for @timestamp.


Now go to discover and you will see the @timestamp field in your blog post.


You can go back to the _mapping API to see the new mapping for @timestamp.

 

$ curl -XGET 'http://localhost:9200/blog/_mapping?pretty'

 

This command returns the JSON encoded mapping.

 

{
  "blog" : {
    "mappings" : {
      "article" : {
        "properties" : {
          "@timestamp" : {
            "type" : "date"
          },
          "body" : {
            "type" : "text",
            "analyzer" : "snowball"
          },
          "lines" : {
            "type" : "integer"
          },
          "published_from" : {
            "type" : "date"
          },
          "tags" : {
            "type" : "keyword"
          },
          "title" : {
            "type" : "text",
            "fields" : {
              "raw" : {
                "type" : "keyword"
              }
            },
            "analyzer" : "snowball"
          }
        }
      }
    }
  }
}

 

Unfortunately, we still may have a problem. If you notice, @timestamp here is in the form of “April 1st 2017, 19:28:47.842.” If you’re sending a Document to an existing Logstash doc store, it most likely will have the default @timestamp format.

To accommodate the default @timestamp format (or any custom format), you can update the model’s save stanza with a string format time command.

 

    def save(self, **kwargs):
        self.lines = len(self.body.split())
        # Use UTC, since the trailing 'Z' in the format string designates Zulu (UTC) time
        t = datetime.utcnow()
        self['@timestamp'] = t.strftime('%Y-%m-%dT%H:%M:%S.%fZ')
        return super(Article, self).save(**kwargs)

 

You can see the change in Kibana as well (view the raw JSON).


That’s it!  The more you use the Python Elasticsearch DSL, the more you will love it.

Pass Bootstrap HTML attributes to Flask-WTForms

Flask-WTForms helps us create and use web forms with simple Python models. WTForms takes care of the tedious, boring and necessary security required when we want to use data submitted to our web app via a user on the Internet. WTForms makes data validation and Cross-Site Request Forgery (CSRF) avoidance a breeze. Out of the box, however, WTForms creates ugly forms with ugly validation. Flask-Bootstrap provides a professional layer of polish to our forms, with shading, highlights and pop-ups.

Flask-Bootstrap also provides a “quick_form” method, which commands Jinja2 to render an entire web page based on our form model with one line of code.

In the real world, unfortunately, customers have strong opinions about their web pages, and may ask you to tweak the default appearance that “quick_form” generates. This blog post shows you how to do that.

In this blog post you will:

  • Deploy a web app with a working form, to include validation and polish
  • Tweak the appearance of the web page using a Flask-WTF macro
  • Tweak the appearance of the web page using a Flask-Bootstrap method

The Baseline App

The following code shows the baseline application, which uses “quick_form” to render the form’s web page. Keep in mind that this application doesn’t do anything yet, although you can easily extend it to persist data using an ORM (for example). I based the web app on a simple three file architecture.


The web app contains models.py (contains form model), take_quiz_template.html (renders the web page) and application.py (the web app that can route to functions based on URL and parse the form data).

[ec2-user@ip-192-168-10-134 ~]$ tree flask_bootstrap/
flask_bootstrap/
├── application.py
├── models.py
├── requirements.txt
└── templates
    └── take_quiz_template.html

1 directory, 4 files
[ec2-user@ip-192-168-10-134 ~]$ 

Install the three files into your directory. As seen in the tree output above, be sure to create a directory named templates for take_quiz_template.html.
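The models.py gist is not reproduced in this capture. A sketch consistent with the field names used later in the post (essay_question, email_addr and submit) might look like this; the labels and validators are assumptions.

#!/usr/bin/env python
# models.py: a sketch of the form model. The field names match those
# referenced later in the post; labels and validators are assumptions.
from flask_wtf import FlaskForm
from wtforms import StringField, SubmitField, TextAreaField
from wtforms.validators import DataRequired, Email


class QuizForm(FlaskForm):
    essay_question = TextAreaField('Essay Question',
                                   validators=[DataRequired()])
    email_addr = StringField('Email Address',
                             validators=[DataRequired(), Email()])
    submit = SubmitField('Submit')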

Create and activate your virtual environment and then install the required libraries.

[ec2-user@ip-192-168-10-134 ~]$ virtualenv flask_bootstrap/
New python executable in flask_bootstrap/bin/python2.7
Also creating executable in flask_bootstrap/bin/python
Installing setuptools, pip...done.
[ec2-user@ip-192-168-10-134 ~]$ . flask_bootstrap/bin/activate
(flask_bootstrap)[ec2-user@ip-192-168-10-134 ~]$ pip install -r flask_bootstrap/requirements.txt

  ...

Successfully installed Flask-0.11.1 Flask-Bootstrap-3.3.7.0 Flask-WTF-0.13.1 Jinja2-2.8 MarkupSafe-0.23 WTForms-2.1 Werkzeug-0.11.11 click-6.6 dominate-2.3.1 itsdangerous-0.24 visitor-0.1.3
(flask_bootstrap)[ec2-user@ip-192-168-10-134 ~]$ 

Start your flask application and then navigate to your IP address. Since this is just a dev application, you will need to access port 5000.

(flask_bootstrap)[ec2-user@ip-192-168-10-134 ~]$ cd flask_bootstrap/
(flask_bootstrap)[ec2-user@ip-192-168-10-134 flask_bootstrap]$ ./application.py 
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger pin code: 417-431-486

This application uses the quick_form method to generate a web page. Note that the application includes all sorts of goodies, such as CSRF avoidance, professional looking highlights and validation. Play around with the page to look at the different validation pop-ups and warnings.

Now imagine that your customer wants to change the look of the submit button, or add some default text. In this situation, the quick_form does not suffice.

Attempt 1: Use a Flask-WTF Macro

The Flask-WTF docs include a Macro named render_field which allows us to pass HTML attributes to Jinja2. We save this macro in a file named _formhelpers.html and stick it in the same templates folder as take_quiz_template.html.

{% macro render_field(field) %}
  <dt>{{ field.label }}
  <dd>{{ field(**kwargs)|safe }}
  {% if field.errors %}
    <ul class=errors>
    {% for error in field.errors %}
      <li>{{ error }}</li>
    {% endfor %}
    </ul>
  {% endif %}
  </dd>
{% endmacro %}

Now, update the take_quiz_template.html template to use the new macro, as sketched below. Note that we lose the quick_form shortcut and need to spell out each form field.
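A sketch of the relevant part of the updated template follows. It assumes the view passes the form object in as form; form.hidden_tag() keeps the CSRF token that quick_form used to render for us.

{% from "_formhelpers.html" import render_field %}
<form method="POST">
    {{ form.hidden_tag() }}
    {{ render_field(form.essay_question, class='form-control', placeholder='Write down your thoughts here...') }}
    {{ render_field(form.email_addr, class='form-control', placeholder='your@email.com') }}
    {{ render_field(form.submit, class='btn btn-warning btn-block') }}
</form>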

When you go to your web page you will see the default text we added to the field:

{{ render_field(form.essay_question, class='form-control', placeholder='Write down your thoughts here...') }}

And an orange submit button that spans the width of the page:

{{ render_field(form.submit, class='btn btn-warning btn-block') }}

You can see both of these changes on the web page.

Unfortunately, if you click submit without entering any text, you will notice that we have reverted to ugly validations.

Attempt 2: Use Flask-Bootstrap

Although pretty much hidden in the Flask-Bootstrap documents, it turns out you can pass extra HTML attributes directly to the template engine using form_field.

As before, we add default text with “placeholder:”

{{ wtf.form_field(form.essay_question, class='form-control', placeholder='Write down your thoughts here...') }}
{{ wtf.form_field(form.email_addr, class='form-control', placeholder='your@email.com') }}

We then customize the submit button. You can customize the button however you would like. Take a look here for more ideas.

{{ wtf.form_field(form.submit, class='btn btn-warning btn-block') }}

This gives us a Bootstrap rendered page with pretty validation.

As you can see, we get a popup if we attempt to submit without entering text.

Conclusion

You now have a working web application that easily renders professional looking forms with validation and pop-ups. In the future, you can trade ease of deployment against customizability.

Reliable Multicast at Internet Scale (Part 4): The Gotchas

Freshlex LLC (should) architect the reliable multicast infrastructure for the putative John Carmack biopic, which will hit the Internet in December of 2018. The first blog post discusses two of the enabling technologies, FCAST and ALC. The second blog post discusses the remaining three technologies, LCT, WEBRC and the FEC building block. The third blog post discusses an architecture that integrates the five technologies. This final blog post discusses possible integration issues and challenges.

WEBRC Integration Issues

Our FCAST/ALC architecture runs over a multicast IP network. An issue arises when the multicast IP network uses RFC 1112 Any Source Multicast (ASM). The WEBRC building block uses multicast round trip time (MRTT) and packet loss to compute a target reception rate. The WEBRC receiver uses this target reception rate for congestion control. ASM skews MRTT and packet loss, and thus gives receivers an erroneous target reception rate.

The WEBRC receiver computes MRTT as the time it takes the receiver to receive the first data packet after sending a join request to a channel. ASM, however, initiates multicast using rendezvous points (RP). All transmitters send their data packets to an RP (decided a priori by network engineers) that may be far away. The receivers send a join to this RP. Once data packets begin to flow to the receivers, the routers switch to a shortest path tree (SPT), finding the shortest path from the transmitter to the receiver, which does not need to include the RP. [RFC5775 6]

The following scenario illustrates how switching from an RP to an SPT skews the WEBRC receiver MRTT computation (and therefore the target reception rate). We use ASM, so “any source” transmits to the multicast address. TX-A and TX-B both have data to multicast. They transmit to the rendezvous point. The RX has no idea who is sending; they just want to join the multicast, so they send a join to the rendezvous point. The RP is three hops away. Let’s say, for illustrative purposes, each hop adds 10ms delay. The join takes 30ms to reach the RP, and then the first data packets from the multicasts for TX-A and TX-B take 30ms each.

Thus, the RX computes the MRTT for TX-A as 60ms, and the MRTT for TX-B as 60ms. At this point the multicast enabled routers switch to the shortest path tree. The multicast from TX-A to the RX now only takes one hop, so the actual MRTT would be 2 * 10ms, or 20ms. The multicast from TX-B to the RX is now four hops, so the actual MRTT should be 80ms. Thus, as a result of ASM, the RX sets the target reception rate for TX-A 66% too low, and the target reception rate for TX-B 33% too high.
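To make the arithmetic concrete, here is the same calculation as a short Python sketch. The per-hop delay and the simplifying assumption that the target rate scales with 1/MRTT come from the example above, not from the WEBRC specification.

# Worked version of the ASM example above (simplification: rate ~ 1/MRTT).
HOP_DELAY_MS = 10.0

# Via the rendezvous point: the join travels 3 hops, the first data packet 3 hops.
measured_mrtt = 2 * 3 * HOP_DELAY_MS       # 60 ms, measured for both TX-A and TX-B

# After the routers switch to the shortest path tree:
actual_mrtt_tx_a = 2 * 1 * HOP_DELAY_MS    # 20 ms (one hop)
actual_mrtt_tx_b = 2 * 4 * HOP_DELAY_MS    # 80 ms (four hops)

# Rate error if the receiver keeps using the stale 60 ms measurement:
tx_a_error = 1 - actual_mrtt_tx_a / measured_mrtt  # 0.67 -> rate set ~66% too low
tx_b_error = actual_mrtt_tx_b / measured_mrtt - 1  # 0.33 -> rate set ~33% too high
print(tx_a_error, tx_b_error)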

The “saving grace” in the case of TX-B would be the dropped packets, since 1/3 would drop and the RX would adjust the target reception rate accordingly. WEBRC, however, adjusts rates at points in time that are separated by seconds. In addition, if we lost packets during the switchover from RP to SPT, then the RX would have incorrect parameters for packet loss (based on receiving or not receiving monotonically increasing sequence numbers), which would skew the target reception rate. The solution to this issue is to use SSM multicast, which does not use RPs. If we must use ASM, then have one RP (and thus one multicast address) per sender and put the RP as close to the sender as possible (i.e. on the first hop router at the demarc). [RFC5775 6]

Another design issue with WEBRC deals with setting the appropriate wave channel rates. We need to set the base rate to the lowest common denominator, so that all users can subscribe to it. The main purpose of the base channel is to communicate timing information (CTSI) and wave channel rates to the receivers so they can sync their joins to wave channel periods and join enough channels to reach their target rates (RFC3738 8). We need, however, to find the right balance for the wave channel data rates, weighing granularity against the number of multicast channels. If we had a video stream at OC-192 rates, would it make sense to have 3.75e+4 channels? Would the joins flood the network? It makes sense to tune the channel rates to the expected use case. If 99% of the users have the same capacity, then we can be coarse. If the bell curve of capacity is low and wide, then we need to be more granular. The only way to find the optimal channel rates is through offline analysis, either using mathematical analysis (Bertsekas, Kleinrock, Jackson etc.) or a discrete event simulation (DES) such as Riverbed SteelCentral NetPlanner. Offline analysis, however, requires user profiles, use cases and real life network metrics.

Consider the following poor design choice. We have three channels: the base channel is set to T1, wave 1 is an OC-12 and wave 2 is an OC-192. A receiver with an OC-12 does not have enough capacity to join the base and wave 1, so it is stuck with just the T1 rate, a very poor efficiency.

The final issue for WEBRC deals with the length of periods for joins. We need to balance the join/leave times against available bandwidth fluctuations. For example, if a receiver joins a channel and the bandwidth drops significantly, the receiver can’t leave that channel until the next time slot (RFC3738 13). For the duration of the time slot, the traffic congestion may choke other congestion controlled protocols (like TCP). The RFC recommends 10s/period (RFC3738 9). Since our data rate is constant, the receivers should not have any surprises, and this period duration should suffice. This, however, is still an issue that needs to be observed and addressed during live transmissions.

LCT Integration Issues

LCT provides a convenient mechanism for setting the mandatory transport session ID (TSI). As per the RFC, we have the option of using the 16-bit UDP source port field to carry the TSI (RFC5651 9). I would recommend against this, since we cannot guarantee that downstream networks will avoid port address translation or firewalling, either of which could rewrite the port. Since the LCT header is mandatory and contains a field for the TSI, it’s best to just set the TSI there.

FCAST Integration Issues

To recap, FCAST uses sessions to send objects to receivers. FCAST sends objects by creating a carousel instance, filling the carousel instance with objects, and then using a carousel instance object to let receivers know which objects the carousel instance carries (RFC6968 8).

FCAST uses one session per sender, and in each session each object must have a unique Transport Object Identifier (TOI). Our integration engineers need to be aware of the potential for TOI wrapping in long-lived sessions (RFC6968 10). FCAST gets the TOI from the LCT header.

The LCT RFC allows a finite number of bits in the LCT header for the TOI (RFC5651 17). Thus, for long-lived sessions (days, weeks), the TOI wraps and presents ambiguity to receivers, similar to the issue of byte sequence number wrapping in TCP. A receiver may receive two separate objects with the same TOI in the course of a long-lived session. With “on demand” mode, carousels cycle through the same pieces of data a set number of times. Consider a large carousel instance, where FCAST sends an object with TOI “1”, followed by enough objects to wrap the TOI.

During the first cycle, the CI sends another, newer object with TOI “1.” The cycle finishes, and FCAST starts the cycle again, sending the original object with TOI “1.” The receiver has no idea what to do with the object of TOI “1,” since it alternates as a reference for two distinct objects. One way to prevent this issue is, once FCAST reaches the halfway point of the TOI space, to resend any old data with a new TOI. Another way to prevent TOI wrap ambiguity is to associate metadata with the TOI, so the receiver can distinguish between two objects with the same TOI. [RFC6968 10-11]

Another integration issue relates to the Carousel Instance Object (CIO). As mentioned in the above paragraph, the CIO carries the list of the compound objects that comprise a Carousel Instance, and specifies the respective Transmission Object Identifiers (TOI) for the objects. The CIO contains a “complete” flag that informs the receiver that the CI will not change in the future (i.e. FCAST guarantees the sender will not add, remove or modify any objects in the current carousel instance). Consider a receiver that receives a CIO with a “complete” flag. We may be tempted to use the list of compound object TOI as a means to filter incoming data. The issue, however, is that FCAST treats the CIO (the list of objects) as any other object; that is, there are no reserved TOI that designate an object as a CIO. Thus, the receiver will never know in advance the TOI of a CIO, so the RFC recommends that receivers do not filter based on TOI. If a receiver were to filter out all but the TOI received in a CIO with a “complete” flag, that receiver would also filter out any new CIO for new carousel instances associated with the session, and the receiver may miss out on interesting objects. [RFC6968 9]

FCAST, finally, allows integrators to send an empty CIO during idle times. The empty CIO lets receivers know all previous objects have been removed, and can be used as a heartbeat mechanism. [RFC6968 9-10]

Conclusion

Real time video content delivery systems are a challenge due to the constant, high data rates involved. Real time video CDP lend themselves to circuit switched networks that can reserve enough bandwidth from sender to receiver and let the data fly. The next best architecture would be packet switched networks designed for multimedia, such as the Integrated Services Digital Network (ISDN) or ATM. Our customer, unfortunately, required us to deploy a CDP on the Internet. To make matters worse, they required our CDP to handle millions of simultaneous receivers. Normally, when delivering constant rate video on the Internet, engineers will use a signaling technology such as the Resource Reservation Protocol (RSVP) to guarantee bandwidth from sender to receiver. An end to end (E2E) reservation scheme, however, does not lend itself well to a system with one sender and millions of receivers.

Our challenge, therefore, was to identify and deploy a one to many, massively scalable CDP that provides asynchronous, reliable and fair, multi-rate streaming data transport. We identified a solution based on IETF standards. This series went through the solution intent, the technologies used, the integration choices made, the integration issues avoided and the validation steps performed to ensure success.

Update: January 2019:

In the end, we successfully integrated the architecture and met all of Warner Brothers’ requirements. Averaged over the course of the movie, the average receiver ran at 90% of the available network capacity, and utilized 87% of processor resources. The bit error rate for the average receiver was 1e-13. Finally, both Anthony Michael Hall and Dwayne “The Rock” Johnson went on to win Academy Awards for best actor and best supporting actor.

Bibliography

[RFC3453] Luby, M., Vicisano, L., Gemmell, J., Rizzo, L., Handley, M. and J. Crowcroft, “The Use of Forward Error Correction (FEC) in Reliable Multicast”, RFC 3453, December 2002.

[RFC3738] Luby, M. and V. Goyal, “Wave and Equation Based Rate Control (WEBRC) Building Block”, RFC 3738, April 2004.

[RFC5651] Luby, M., Watson, M. and L. Vicisano, “Layered Coding Transport (LCT) Building Block”, RFC 5651, October 2009.

[RFC5775] Luby, M., Watson, M. and L. Vicisano, “Asynchronous Layered Coding (ALC) Protocol Instantiation”, RFC 5775, April 2010.

[RFC6968] Roca, V. and B. Adamson, “FCAST: Scalable Object Delivery for the ALC and NORM Protocols”, RFC 6968, July 2013.

Reliable Multicast at Internet Scale (Part 3): The Architecture

Freshlex LLC (should) architect the reliable multicast infrastructure for the putative John Carmack biopic, which will hit the Internet in December of 2018. The first blog post discusses two of the enabling technologies, FCAST and ALC. The second blog post discusses the three technologies that enable ALC: LCT, WEBRC and the FEC building block. This blog post discusses an architecture that integrates the five technologies.

Integration Choices

Massive scalability drives this integration effort. For the content delivery platform (CDP), we define scalability as the behavior of the CDP in relation to the number of receivers and network paths, their heterogeneity and the ability to accommodate dynamically variable sets of receivers. In general, three factors limit the scalability of a CDP: (1) memory or processing requirements, (2) the amount of feedback control and (3) redundant data traffic (RFC5651 5). The previous blog posts describe the standards used to create a massively scalable CDP. Standards, however, are not “turn-key” solutions. Engineers must make certain design choices when implementing standards. This blog post discusses the design choices made during this integration effort in order to conform to the spirit of massive scalability.

FCAST Integration Choices

Recall that FCAST uses data carousels to send objects to receivers. We have two design choices here: push mode or on-demand mode (RFC6968 3). Push mode associates a single carousel instance with a cycle (RFC6968 8). On-demand mode makes compound objects available for a long period of time by using a very large number of transmission cycles. On-demand mode lends itself well to bulk data transport, such as a software update. A sender could have a carousel cycle for days. Clients then join the session at their leisure and leave once they receive the entire update. Push mode works better for (near) real time streaming video. The clients join at any time, but they will miss any video that occurs before their join. The integrator would need to design how to best implement this push mode. They could, for example, have one carousel instance per hour of video, with 12 minute chunks of data being an object. The carousel instance object lists the transport object IDs of the five compound objects, and sets the complete flag, indicating the carousel instance has a finite set of compound objects.

Integrators have many options in increasing the reliability of FCAST. For example, when using on-demand mode, an integrator can set the number of cycles to repeat for a period of time that exceeds the typical download time. In this case, you can correlate number of cycles with reliability (RFC3453 2-3). An integrator can use a backchannel for session control, for example the carousel does not stop cycling until every receiver acknowledges full receipt. In this case, FCAST is fully reliable. Of course, the concept of a backchannel is unacceptable for massive scalability.

In our integration, since we’re using push mode we don’t have the luxury of repeating cycles for reliability. For that reason we use a robust FEC building block, which is a requirement of ALC anyway (RFC5775 11).

ALC Integration Choices

The ALC standard omits application specific features to keep it massively scalable (RFC5775 5-6). An integrator can tailor the applications (e.g. FCAST) that use ALC to add features and trade scalability if needed. The backchannel mentioned above in the discussion of FCAST design choices is one such example.

The first step of an ALC session entails the receiver acquiring the session description information (RFC5775 17). The transmission of the session description information from the ALC sender to the receivers is outside of the scope of the ALC standard. An integrator, regardless, has many options. The sender can describe the session using SDP as described in RFC4566, XML metadata as described in RFC3023, or HTTP/MIME headers as defined in RFC2616. The sender, alternatively, can carry the session description in the Session Announcement Protocol (SAP) as per RFC2974.

We will simply have a well-known web page with session description information. When an RX wants to join a session, they go to that web page and download the session description (RFC5651 24).

FEC Integration Choices

The main unresolved question for the FEC building block pertains to the use of in-band or out-of-band channels to communicate FEC metadata to the RX (RFC5775 11). Put another way, how does a receiver decode the following encoded message from a sender: “I’ve encoded this message using this scheme.” The previous statement is a paradox: the receiver would not be able to decode the message unless they decoded the message to obtain the correct way to decode the message. We solve this problem by providing both the FCAST transmitter and FCAST receiver software to all parties. In order to receive the streaming video, a receiver must use our player. The sender and receiver software use one FEC scheme, LDPC Staircase and Triangle FEC, as described in RFC5170. We will throw the open source zealots a bone and point them to the RFC, if they wish to build their own receiver.

The next design choice deals with the application of FEC codes. For our data carousel we chose a large block systematic code from RFC5170. A FEC data carousel using a large block FEC encoder considers all k source symbols of an object as one block and produces n encoding symbols. The carousel transmits the n encoding symbols in packets in the same order in each round. A receiver joins the transmission at any point, and as long as the receiver receives at least k encoding symbols during the transmission of the next n encoding symbols, the receiver can completely recover the object (RFC3453 3).

In the case of our push mode carousel, we partition our stream into objects. The FEC building block turns these objects into source symbols. The FEC building block then encodes these source symbols into encoding symbols and then the sets of the encoding symbols for each object are transmitted to each receiver.

Ideally, the FEC building block creates, encodes and transmits the source blocks in such a way that each received multicast packet is fully useful to reassemble the object independent of previous packet reception. Thus, if some packets are lost in transit between the TX & RX, the receiver uses any subsequent equal number of packets that arrive to reassemble the object (RFC3453 4). We prefer this to the alternatives, such as asking the transmitter for the missed packets (ARQ) or waiting on the carousel to re-send the desired packets (which won’t happen, since we’re in push mode and thus have one cycle per carousel). This property reduces the problems associated with push mode data carousels (RFC3453 3).

WEBRC Integration Choices

The appropriate congestion control for content bulk data transfer differs from the appropriate congestion control for streaming video. For bulk data transfer, the intent is to use all available BW and then drastically back off when there is competing traffic. Streaming delivery applications prefer a lesser, constant rate to bursty peaks, with slight or no backoff.

From the RFC, engineers tuned WEBRC to work best in situations that have a low throughput variation over time, which makes it well suited to telephony or our streaming video where a smooth rate is important. The penalty for smoother throughput, however, is that WEBRC responds more slowly (compared with TCP) to changes in available BW. [RFC3738 4]

Another reason that we use WEBRC for our streaming video application is that WEBRC was designed for applications that use a fixed packet size and vary their packet reception rates in response to congestion. In general, WEBRC was designed to be reasonably fair when competing for BW with TCP flows; that is, its RX rate is within a factor of two of the expected RX rate if TCP were used [RFC3738 4].

By default, WEBRC avoids using techniques that are not massively scalable. For example, WEBRC does not provide any mechanisms for sending information from receivers to senders, although this does not rule out protocols that both use WEBRC and send information from receivers to senders. For massive scalability, nonetheless, we have made the integration choice not to use any backchannels. [RFC3738 1]

LCT Integration Choices

Part of the integration effort deals with how to get data objects to the LCT building block. Consider a push model, where we want to push a 50MB file via a carousel. We need to choose how to get those data into LCT. Suppose we break the file into 1KB packets, giving 50,000 packets. Then, if we send 50 pkts/sec to one channel, each RX needs 1,000 sec to get the file. A better implementation splits the file into multiple layers so that the aggregate rate is 1,000 packets/second.

With no loss, an RX can now complete the file download in 50 seconds by subscribing to all channels. Each channel, however, requires us to register a new multicast IP address with the multicast NW.
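
Spelling out the arithmetic in R (the 20-channel split is my assumption; the text only fixes the aggregate rate):

file_bytes  <- 50e6                      # 50MB file
pkt_bytes   <- 1e3                       # 1KB packets
pkts        <- file_bytes / pkt_bytes    # 50,000 packets
rate_per_ch <- 50                        # packets/second per channel

pkts / rate_per_ch            #> 1000 seconds on a single channel
n_ch <- 20                    # e.g. 20 channels gives 1,000 pkts/s aggregate
pkts / (rate_per_ch * n_ch)   #> 50 seconds across all channels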

We could configure the sender to include Expected Residual Time (ERT) in the packet header extension (RFC5651 22). The ERT indicates the expected remaining time of packet transmission for either the single object carried in the session or for the object identified by the Transmission Object Identifier (TOI) if there are multiple objects carried in the session. While useful for “on-demand” mode, we don’t need to configure this for our push mode. The data we push is one time: take it or leave it. The ERT only applies when we send the same object out for multiple cycles. With the “one cycle per carousel” push mode, the ERT field does not provide any useful information (RFC6968 8).

Conclusion

This blog post discusses an architecture that integrates the five enabling reliable multicast technologies. The next and final blog post discusses integration challenges.

Bibliography

[RFC3453] Luby, M., Vicisano, L., Gemmell, J., Rizzo, L., Handley, M. and J. Crowcroft, “The Use of Forward Error Correction (FEC) in Reliable Multicast”, RFC 3453, December 2002.

[RFC3738] Luby, M. and V. Goyal, “Wave and Equation Based Rate Control (WEBRC) Building Block”, RFC 3738, April 2004.

[RFC5651] Luby, M., Watson, M. and L. Vicisano, “Layered Coding Transport (LCT) Building Block”, RFC 5651, October 2009.

[RFC5775] Luby, M., Watson, M. and L. Vicisano, “Asynchronous Layered Coding (ALC) Protocol Instantiation”, RFC 5775, April 2010.

[RFC5170] Roca, V., Neumann, C., and D. Furodet, “Low Density Parity Check (LDPC) Staircase and Triangle Forward Error Correction (FEC) Schemes”, RFC 5170, June 2008.

[RFC6968] Roca, V. and B. Adamson, “FCAST: Scalable Object Delivery for the ALC and NORM Protocols”, RFC 6968, July 2013.

Reliable Multicast at Internet Scale (Part 2): LCT, WEBRC and FEC

Freshlex LLC (should) architect the reliable multicast infrastructure for the putative John Carmack biopic, which will hit the internet in December of 2018. The first blog post discusses two of the enabling technologies, FCAST and ALC. This blog post discusses the three technologies that enable ALC: LCT, WEBRC and the FEC building block.

Reliable Multicast Building Block: LCT

LCT Description

Since the 70’s, engineers have for the most part associated the transport layer of the Open Systems Interconnect (OSI) protocol stack with either the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP). More recently, we have the Real-time Transport Protocol (RTP) as a session layer for real time media. This blog post introduces a new transport layer protocol, Layered Coding Transport (LCT).

LCT acts as a building block for ALC. LCT provides a transport layer service that, in concert with FEC and WEBRC, allows ALC to be a massively scalable and reliable content stream delivery protocol for IP multicast networks. The RMT WG designed LCT for multicast protocols and for compatibility with WEBRC and FEC. LCT does not require any backchannel and works well with any LAN, WAN, Intranet, Internet, asymmetric NW, wireless NW or satellite NW (RFC5651 9). LCT works best for objects of at least multiple GB that are transmitted for at least tens of seconds. Streaming applications benefit greatly from LCT. [RFC5651 4]

LCT Architecture Definition

LCT uses a single sender that transmits objects (interesting to receivers) via packets to multiple channels for some period of time. The sender splits the objects into packets and associates each packet with its object using headers. LCT works with WEBRC to provide multiple-rate congestion control. Receivers join and leave LCT layers (via ALC channels) during participation in a session to reach their target reception rate (see WEBRC).

As the name suggests, LCT uses layered coding to produce a coded stream of packets that LCT partitions into ordered sets. The FEC building block codes the packets for reliability. For streaming media applications, layering allows variable transfer speeds, and by extension variable image quality, for RX with arbitrary NW capacity. The best example of LCT follows.

Imagine a web TV application split into three layers. An RX that joins the first channel receives a black and white picture. An RX with more capacity joins the first and second channels and receives a color picture. An RX with ample capacity joins all three layers and receives an HD color picture. The key to this example is that the sender does not duplicate any data between layers. The RX joins successive layers to receive a higher quality picture at the cost of using more bandwidth. [RFC5651 6]

LCT Operations

The WEBRC building block sends packets associated with a single session to multiple LCT channels at rates computed to optimize multiple-rate congestion control (RFC3738 3). The receivers join one or more channels according to the NW congestion. The WEBRC building block provides LCT with information for the CCI field, which is opaque to LCT (RFC5651 16). The FEC building block codes the packets that LCT sends to channels for reliability.

On the RX side, the RX must first join an LCT session. The RX must obtain enough of the session description parameters to start the session. Once the RX has all the session description parameters, the RX begins to process packets. The RX must identify and de-multiplex the packets associated with the LCT session. Each LCT session must have a unique Transport Session Identifier (TSI), scoped by the (Sender IP Address, TSI) pair. LCT stamps each packet’s LCT header with the appropriate TSI, as in the sketch below. [RFC5651 25-26]
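
A toy sketch of that demultiplexing step in R (the packet fields and addresses are invented for illustration):

# Keep only packets whose (sender IP, TSI) pair matches our session.
pkts <- data.frame(
  sender = c("198.51.100.7", "203.0.113.9", "198.51.100.7"),
  tsi    = c(42L, 42L, 7L),
  stringsAsFactors = FALSE
)
session <- list(sender = "198.51.100.7", tsi = 42L)

subset(pkts, sender == session$sender & tsi == session$tsi)
#>         sender tsi
#> 1 198.51.100.7  42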

The RMT WG designed LCT for best effort (BE) service. BE service does not guarantee packet reception or packet reception order. BE service does not provide support for reliability or flow/congestion control. LCT does not provide any of these services on its own. ALC, however, uses LCT along with FEC and WEBRC to provide reliable, multiple-rate, congestion controlled layered transport. [RFC5651 27]

Reliable Multicast Building block: WEBRC

WEBRC Description

As per RFC 2357, the use of any reliable multicast protocol in the Internet requires an adequate congestion control scheme. Furthermore, ALC must support RFC3738, the Wave and Equation Based Rate Control (WEBRC) Building Block (RFC5775 10). WEBRC provides multiple-rate congestion control for data delivery. As with FCAST, ALC, LCT and multicast FEC, the RMT WG designed WEBRC to support protocols for IP multicast. In the spirit of massive scalability, WEBRC requires no feedback and uses a completely receiver-driven congestion control protocol. WEBRC enables a single sender to deliver data to individual receivers at the fastest possible rate, even in a highly heterogeneous network architecture. In other words, WEBRC dynamically varies the reception rate of each RX independent of other receivers (RFC3738 1). WEBRC competes fairly with TCP and similar congestion control sessions (RFC3738 4).

WEBRC Architecture Definition

A single sender transmits packets to multiple channels. The sender designates one channel as the base channel; the remaining channels are wave channels. Each channel starts at a high packet rate, and after each equally spaced period of time, the packet rate of that channel reduces until the channel is quiescent. A channel’s cycle from full rate to quiescence takes a configurable number of periods, which by default sum to a long duration (several minutes). At the end of each period, the RX joins or leaves channels depending on whether the aggregate of the current TX rates allows the RX to reach its target RX rate. At the end of each period the RX orders the wave channels into layers, based on their TX rates (the higher the rate, the higher the layer). The designation of wave channel to layer, therefore, varies cyclically over time. Once joined, an RX stays with a channel until that channel becomes quiescent. [RFC3738 8]

A key metric for each receiver, therefore, is the target reception rate. The target reception rate drives the number of layers (and by extension, channels) that a receiver must join. The RX measures and performs calculations on congestion control parameters (e.g. the average loss probability and the average RTT) and makes decisions on how to increase or decrease its reception rate based on these parameters. The RX-based approach of WEBRC lends itself to protocols where the sender handles multiple concurrent connections, and therefore WEBRC is suitable as a building block for multicast congestion control. An RX with a slow connection does not slow down receivers with faster connections. [RFC3738 13-23]

WEBRC Operations

When WEBRC receives packets from ALC, WEBRC first checks that the packets belong to the appropriate session before applying WEBRC. ALC uses LCT, so WEBRC looks to the LCT header to find the (sender IP address, TSI) tuple that denotes which session a received packet belongs to (RFC5651 12). The multicast network identifies a channel to receivers via a (sender IP address, multicast group address) pair, and the receiver sends join and leave messages to the multicast group address. When the RX initiates a session, it must join the base channel. The packets on the base channel help the RX determine the current time slot index, which in turn allows the RX to know the relative rates on the wave channels. The RX orders these wave channels into layers, from lowest to highest rates. The RX remains joined to the base channel for the duration of its participation in the session. [RFC3738 8]

As mentioned earlier, the lowest layer has the lowest rate and the highest layer has the highest rate. Each time a wave channel becomes active, it becomes the highest layer. At the end of each time slot the lowest-layer wave channel deactivates and all channels move down a layer. An RX always leaves the lowest layer when it deactivates.

After joining a session, the RX adjusts its rate upwards by joining wave channels in sequence, starting with the lowest layer and moving towards the highest. The rates on the active wave channels are decreasing with time so the receiver adjusts its rate downward simply by refraining from joining additional wave channels. The layer ordering among the channels changes dynamically with time so the RX must monitor the Current Time Slot Indicator (CTSI).

Once the receiver joins a wave channel, the receiver remains joined to the wave channel until it deactivates (RFC3738 8). The following diagram illustrates the relationship between wave channels, layers and target reception rate.

[Figure: wave channels mapped to layers over time, with per-channel rates and the receiver’s target reception rate]

In the above figure, assume the receiver wants a target rate of 7λ/4 packets per second (pps). This means the receiver must join the base (λ/4pps), layer 0 (λ/4pps), layer 1 (λ/2pps) and layer 2 (3λ/4pps). The receiver joins layers by joining underlying channels, sending joins and leaves to their respective multicast addresses. We see in the figure that for time t, layer 2 contains wave channel 4, layer 1 contains wave channel 3 and layer 0 contains wave channel 2. The receiver leaves channel 1 (which is now quiescent). The receiver stays joined to the base and wave channels 3 and 2. The receiver sends a join to wave channel 4. At time t+1, the layers change again. The receiver stays joined to the base, 4 and 3. The receiver leaves channel 2 and joins channel 0. For time t+2, the receiver stays joined to the base, 0 and 4. The receiver leaves channel 3 and joins channel 1.
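 
In R, the join decision above reduces to a running sum. A small sketch, with rates in units of λ taken from the figure walk-through:

# Layer rates in units of lambda, from the example above.
rates  <- c(base = 1/4, layer0 = 1/4, layer1 = 1/2, layer2 = 3/4)
target <- 7/4   # target reception rate, in units of lambda

# Join from the lowest layer upward while the running aggregate
# stays within the target reception rate.
joined <- names(rates)[cumsum(rates) <= target]
joined                #> "base" "layer0" "layer1" "layer2"
sum(rates[joined])    #> 1.75, i.e. 7*lambda/4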

Reliable Multicast Building Block: FEC

FEC Building Block Description

Content Delivery Protocols (CDP) have many options available to them to increase reliability. We’ll first look at two non-forward error correction (FEC) based options: automatic request for retransmission (ARQ) and data carousels. First, consider ARQ. If an ARQ receiver does not receive a packet, or receives a corrupted packet, the receiver asks the sender to re-transmit the packet. ARQ, therefore, requires a back channel and does not scale well for one-to-many CDPs. Using ARQ on a one-to-many CDP sets the architecture up for feedback implosions and the “NACK of death” (imagine 1e+7 receivers simultaneously detecting dropped data and asking for a re-transmission). In addition, in a network where different receivers have different loss patterns, ARQ wastes resources: receivers would need to wait for the re-transmissions of packets that other receivers lost, even if they already have those data. [RFC5052 2]

A data carousel solution partitions objects into equal length pieces of data (source symbols), puts them into packets, and cycles through and sends these packets. Each RX receives the packets until it has a copy of every packet. While the data carousel solution requires no back channel, if an RX misses a packet, the RX has to wait a full cycle until the carousel sends it again. [RFC6968 8]

RFC 3453 describes, therefore, how to use FEC codes to augment or provide reliability for one-to-many reliable data transport using IP multicast. RFC 3453 uses the same packets containing FEC data to simultaneously repair different packet loss patterns at multiple RX. [RFC3453 4]

FEC has multiple benefits for our FCAST/ALC architecture. FEC augments reliability and overcomes erasures (losses) and bit level corruption. The primary application of FEC to IP multicast, however, is as an erasure code, since the IP multicast NW layers detect (bit level) corrupted packets and discard them (or the transport layers use packet authentication to discard corrupted packets) (RFC3453 3).

FEC Operation

The data source inputs some number k of equal length source symbols into FEC. The FEC encoder then generates some number of encoding symbols of the same length as the source symbols. The encoding symbols are placed into packets and then sent. On the receiving side, the RX feeds the encoded symbols into a decoder to recreate an exact copy of the k source symbols. ALC can use block or expandable FEC codes for the underlying FEC building block. [RFC5775 11]

With a block encoder, we input k source symbols and a constant number n. The encoder generates a total of n encoding symbols. The encoder is systematic if it generates n-k redundant symbols, yielding an encoding block of n encoding symbols in total, composed of the k source symbols and the n-k redundant symbols. With a block encoder, any k of the n encoding symbols in the encoding block are sufficient to reconstruct the original k source symbols. [RFC3453 5-6] The toy code below demonstrates the property for the simplest possible case.
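
A minimal systematic block code in R: k data symbols plus one XOR parity symbol, so n = k + 1. Any k of the n encoding symbols recover the block, the degenerate case of the “any k of n” property; real codes such as LDPC Staircase (RFC5170) extend this to much larger n - k.

k <- 4
source_symbols <- c(7L, 42L, 19L, 88L)
parity <- Reduce(bitwXor, source_symbols)   # XOR of all data symbols
block  <- c(source_symbols, parity)         # n = 5 encoding symbols

lost     <- 2                # erase any one symbol, say the second
received <- block[-lost]     # any k = 4 symbols survive

# XOR of the k survivors reproduces the erased symbol.
Reduce(bitwXor, received) == source_symbols[lost]   #> TRUE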

An expandable FEC encoder takes input of k source symbols and generates as many unique encoding symbols as requested on demand. At the receiver side, any k of the unique encoding symbols is enough to reconstruct the original k source symbols. [RFC3453 7]

Conclusion

This post discusses the three technologies that enable ALC for reliable multicast: LCT, WEBRC and the FEC building block. The next blog post discusses an architecture that integrates all of the enabling technologies.

Bibliography

[RFC3453] Luby, M., Vicisano, L., Gemmell, J., Rizzo, L., Handley, M. and J. Crowcroft, “The Use of Forward Error Correction (FEC) in Reliable Multicast”, RFC 3453, December 2002.

[RFC3738] Luby, M. and V. Goyal, “Wave and Equation Based Rate Control (WEBRC) Building Block”, RFC 3738, April 2004.

[RFC5651] Luby, M., Watson, M. and L. Vicisano, “Layered Coding Transport (LCT) Building Block”, RFC 5651, October 2009.

[RFC5775] Luby, M., Watson, M. and L. Vicisano, “Asynchronous Layered Coding (ALC) Protocol Instantiation”, RFC 5775, April 2010.

[RFC6968] Roca, V. and B. Adamson, “FCAST: Scalable Object Delivery for the ALC and NORM Protocols”, RFC 6968, July 2013.

[RFC5052] Watson, M., Luby, M. and L. Vicisano, “Forward Error Correction (FEC) Building Block”, RFC 5052, August 2007.

Reliable Multicast at Internet Scale (Part 1): FCAST and ALC

Reliable multicast?!? How on earth can you guarantee transport using a unidirectional, asynchronous delivery method? Could you scale your solution to support ten million downstream users, each with a different capacity, varying from dial up to 100GbE metro Ethernet? Surprisingly, you can!

In this series of blog posts I discuss a few interesting technologies that provide massively scalable reliable multicast. In summary, you can guarantee reliable multicast using a “data carousel,” and can handle the non-uniform capacity issues with the idea of layered coding. Users receive the layers that their capacity supports; the more layers they subscribe to, the higher the quality of the video stream they receive. Read the next few blog posts to dive into the fascinating details.

Background

Warner Brothers Pictures begins filming the John Carmack biopic (starring Anthony Michael Hall as John Carmack and Dwayne “The Rock” Johnson as John Romero) next month. The film depicts Carmack’s life, from shareware coder to superstar “Doom” engine developer to founder of Armadillo Aerospace. To feed the geek buzz surrounding the picture, Warner Brothers will debut the film on the Internet, providing a one-time, free multicast on December 10th, 2018… the 25th anniversary of the initial release date of “Doom”.

Warner Brothers predicts millions of subscribers to this one-time multicast, and (should) contract Freshlex, LLC to develop the enabling architecture. In other words, Warner Brothers wants to transmit a single video feed to millions of Internet receivers, each with arbitrary network capacity. Warner Brothers needs a reliable, massively scalable solution.

These blog posts demonstrate the use of IETF standard protocols to provide a reliable, massively scalable solution with session management and multiple-rate congestion control that stresses network fairness. Our solution looks at FCAST/Asynchronous Layered Coding (ALC), designed by the IETF Reliable Multicast Transport (RMT) Working Group (WG) to provide just that.

This blog post describes FCAST and ALC, which enable reliable multicast. The next blog post describes the building blocks of ALC: LCT, WEBRC and FEC.

Reliable Multicast Building Block: FCAST

Description

FCAST provides object delivery over Asynchronous Layered Coding (ALC). FCAST uses a lightweight implementation of the User Datagram Protocol (UDP)/Internet Protocol (IP) stack to provide a highly scalable, reliable object delivery service that limits operational processing and storage requirements. Engineers should not consider FCAST highly versatile, but for appropriate use cases (such as the streaming video use case this paper discusses), FCAST is massively scalable and robust. [RFC6968 3]

Features

FCAST uses purely unidirectional transport channels for massive scalability. An engineer could hack FCAST to collect reception metrics, but this limits scalability. FCAST favors simplicity, sending metadata and object content together in a compound object. The in-band approach, however, does not allow a receiver (RX) to decide in advance whether an object is of interest; the RX must join the session and process the metadata portion of the compound object (RFC6968 9). An out-of-band metadata approach would obviate this setback, but remember, the driving requirement of the effort is massive scalability (RFC6968 4). The Reliable Multicast Transport Working Group (RMT WG) designed FCAST to be compatible with ALC and the ALC building blocks: FEC, WEBRC and LCT (RFC5775 1).

Architecture Definition

FCAST provides a content delivery service that transmits objects to a (very large) group of receivers in a reliable way. An engineer could use FCAST over Negative Acknowledgement (NACK) Oriented Reliable Multicast (NORM), but since the RMT WG designed NORM to use NACKs, NORM does not fit the spirit of the architecture. The architecture, therefore, integrates FCAST with ALC. Nothing about FCAST limits the maximum number of receivers. Using ALC provides the FEC building block and thus a measure of reliability. In addition, FCAST uses the concept of data carousels (described below), and the longer a carousel runs, the more reliable the content delivery service becomes, as the quick calculation below suggests. [RFC6968 6]
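
A rough back-of-the-envelope for that claim, under the simplifying (and assumed) model that a receiver independently recovers an object in any one cycle with probability p:

# If each cycle gives a receiver probability p of recovering an object,
# c cycles leave it unrecovered with probability (1 - p)^c.
p <- 0.9        # assumed per-cycle recovery probability
cycles <- 1:5
data.frame(cycles, p_recovered = 1 - (1 - p)^cycles)
#>   cycles p_recovered
#> 1      1     0.90000
#> 2      2     0.99000
#> 3      3     0.99900
#> 4      4     0.99990
#> 5      5     0.99999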

Components: The bullets below describe the FCAST components [RFC6968 5]:

  • Compound Object: Header (Includes metadata) + Object
  • Carousel: Compound object transmission system
  • Carousel Instance
    • Transmission system containing a collection of compound objects
    • Fixed set of registered compound objects that are sent by the carousel during a
      certain number of cycles
    • Note: whenever objects need to be added or removed, a new carousel instance is
      defined
  • Carousel Instance Object (CIO): List of objects in the carousel instance
    • Note: The CIO is itself an object
    • Note: The CIO does not describe the objects themselves (e.g. no metadata)
  • Carousel Cycle: A period of time when all the objects are sent once
    • Transmission round within which all the registered objects in a Carousel Instance
      are transmitted a certain amount of times
    • By default, objects are sent once per cycle
  • Transmission Object Identifier (TOI)
    • The ID number associated with an object at the lower LCT (transport) layer
  • FEC Object Transmission Information (FEC OTI)
    • Information required for FEC encoding and decoding

Operations

On the sender side of FCAST, a user first selects a set of objects to deliver to the receivers and submits the objects and the object metadata to the FCAST application. For each object, FCAST creates the compound object (header, metadata and the original object) and registers the compound object in the carousel instance. The user informs FCAST when he completes submission of all the objects in the set; if the user knows that no other object will ever be submitted, he informs FCAST accordingly. The user then specifies the desired number of cycles. For the most part, the user can correlate the number of cycles with reliability. FCAST now knows the full list of compound objects that are part of the carousel instance and creates a CIO (if desired) with a complete flag (if appropriate). The FCAST application then defines a TX schedule of these compound objects, including the CIO. The schedule defines the order in which the packets of the various compound objects are sent. FCAST now starts the carousel transmission for the number of cycles specified and continues until (1) FCAST completes the desired number of TX cycles, (2) the user wants to kill FCAST, or (3) the user wants to add or remove objects, in which case FCAST must create a new carousel instance. [RFC6968 12]

On the receiver side of FCAST, the RX joins the session and collects encoded symbols. Once the RX receives the header, the RX processes the metadata and chooses whether to continue. Once the RX receives the entire object, the RX processes the header, retrieves and decodes the metadata, and processes the object. When the RX receives a CIO (a compound object with the “I” bit set), the receiver decodes the CIO and retrieves the list of compound objects that are part of the current carousel instance (and can also determine which compound objects have been removed). If the RX receives a CIO with the complete flag set, and the RX has successfully received all the objects of the current carousel instance, the RX can safely exit the current FCAST session. [RFC6968 13]

Reliable Multicast Building Block: ALC

Description

Asynchronous Layered Coding (ALC) provides massively scalable, asynchronous, multiple-rate, reliable, network friendly content delivery transport to an unlimited number of concurrent receivers from a single sender. Three building blocks comprise ALC: (1) IETF RFC5651 Layered Coding Transport (LCT) for transport layer control, (2) IETF RFC3738 Wave and Equation Based Rate Control (WEBRC) for multiple-rate congestion control and (3) the IETF forward error correction (FEC) building block (RFC5052) for reliability [RFC5775 1]. The Reliable Multicast Transport (RMT) working group (WG) designed ALC for IP multicast, although an engineer can use it for unicast. ALC has no dependencies on IP version.

The diagram below shows the FCAST/ALC architecture and packet format.

[Figure: FCAST/ALC architecture and packet format]

Features

ALC has advantageous features. The RMT WG designates scalability as the primary design goal of ALC. IP multicast by design is massively scalable; however, IP multicast only provides a best effort (BE) service devoid of session management, congestion control or reliability. ALC augments IP multicast with session management, congestion control and reliability without sacrificing massive scalability. As a result, the number of concurrent receivers for an object is theoretically infinite, and in practice potentially in the millions. [RFC5775 4]

ALC provides reliable asynchronous transport for a wide range of objects. The aggregate size of delivered objects can vary from hundreds of kilobytes (KB) to terabytes (TB). Each receiver (RX) initiates reception asynchronously and the reception rate for each RX is the maximum fair bandwidth available between the receiver and sender. In other words, each RX believes it has a dedicated session from TX to RX, with rate adjustments that match the available bandwidth at any given time. The building blocks of ALC allow it to perform congestion control, reliable transport and session layer control without the need for any feedback packets. The lack of any channel from receiver to sender enables ALC to be massively scalable. [RFC5775 5]

Architecture

ALC transports one or more objects to multiple receivers using a single sender and a single session. An application (such as FCAST) provides data objects to ALC. ALC generates packets from these objects, formats them, and then hands them off to the lower layer building blocks. The FEC building block encodes them for reliability. The LCT building block provides in-band session management and places the objects onto multiple transmission channels. The WEBRC building block places the data onto the channels at rates optimized for multiple-rate, feedback free congestion control. The RX joins appropriate channels associated with the session, joins or leaves channels for congestion control, and uses the ALC, LCT and FEC information to reliably reconstruct the packets into objects. Thanks to the FEC building block, the RX simply waits for enough packets to arrive to reliably reconstruct the object. The ALC architecture does not provide any ability for an RX to request a re-transmission. Thanks to the focus on massive scalability, the rate of transmission out of the TX is independent of the number and individual reception experience of the RX. [RFC5775 7]

ALC Session

The concept of an ALC session matches that of LCT. A session contains multiple channels from a single sender, used for some period of time to carry packets pertaining to the TX of objects of interest to receivers (RFC5651 4-5). ALC performs congestion control over the aggregate of the packets sent to channels belonging to a session (RFC5775 7).

ALC Session Description

An ALC session requires a session description. Any receiver that wants to join an ALC session must first obtain the session description. A discussion of how to get this session description to the receivers follows in the “integration choices” section of this paper. The session description contains the following information (RFC5775 12):

  • Sender IP Address
  • Number of channels in the session
  • Multicast address and port # for each channel in the session
  • Data Rates used for each channel
  • Length of each packet payload
  • Transport Session Identifier (TSI) for the session
  • An indication if the session carries packets for more than one object
  • Whether the session describes required FEC information (RFC5052) out of band or in-band (using header extensions)
  • Whether the session uses Header Extensions, and if so the format
  • Whether the session uses packet authentication, and if so the scheme
  • The multiple-rate congestion control (MRCC) building block used (the ALC RFC recommends WEBRC, so we use that in this paper)
  • Mappings between settings and Codepoint Value (for example, if different objects use different FEC or authentication schemes, the Codepoint values distinguish them)
  • Object metadata such as when the objects will be available and for how long

Operations

The integration of three building blocks defines ALC, so first and foremost, the sender follows all operations associated with the LCT, FEC and WEBRC building blocks. ALC, nonetheless, first makes available the required session description and FEC Object Transmission Information. As mentioned earlier, the session description contains the sequence of channels associated with the sender. ALC fills in the congestion control indication (CCI) field with information provided by the WEBRC building block. ALC then sends packets at appropriate rates to the channels as dictated by the WEBRC building block. ALC stamps every packet with the Transport Session Identifier (TSI), in case the receivers join sessions from other senders. If this particular session contains more than one object, then ALC stamps each packet with the appropriate Transmission Object Identifier (TOI). ALC stamps the packet payload ID based on information from the FEC building block. As discussed in the “Security Validation” section of this paper, the IETF recommends packet authentication as a precaution. If an ALC instance does use packet authentication, it uses a header extension to carry the authentication information. [RFC5775 16]

The ALC RX also conforms to all operations required by LCT, FEC and WEBRC. The RX first obtains a session description and joins the session. The RX then obtains the in-band FEC Object Transmission Information for each object the RX wants. Upon receiving a packet, the RX parses the packet header and verifies that it is valid (discarding the packet if not). The RX verifies that the (Sender IP Address, TSI) tuple matches one of the pairs received in the session description for the session the RX is currently joined to (if not, it discards the packet). The RX then proceeds with packet authentication and discards the packet if invalid. After valid packet authentication, the RX processes and acts on the CCI field in accordance with the WEBRC building block. If ALC carries more than one object in the session, the RX verifies the TOI (and discards the packet if not valid). The RX finally processes the remainder of the packet, interpreting the other header fields and using the FEC Payload ID and the encoding symbols in the payload to reconstruct the object. [RFC5775 17]

Conclusion

This blog post describes FCAST and ALC, which enable reliable multicast. The next blog post describes the building blocks of ALC: LCT, WEBRC and FEC.

Bibliography

[RFC5651] Luby, M., Watson, M. and L. Vicisano, “Layered Coding Transport (LCT) Building Block”, RFC 5651, October 2009.
[RFC5775] Luby, M., Watson, M. and L. Vicisano, “Asynchronous Layered Coding (ALC) Protocol Instantiation”, RFC 5775, April 2010.
[RFC6968] Roca, V. and B. Adamson, “FCAST: Scalable Object Delivery for the ALC and NORM Protocols”, RFC 6968, July 2013.

Let us now praise ugly code!

In this blog post I will revisit the first piece of code I wrote with the R programming language, back in the early part of this decade.

Coming from an Octave/MATLAB background, I really enjoyed the functional nature of R. I imagined flinging vectors into matrices, collapsing them with dot products, Tetris-like. I refused to write a single for loop; I framed everything as functions and maps. As I gained experience with R, I found pipes and data wrangling libraries, but early on, my code was pretty ugly, as you will see shortly.

I have a project that keeps track of comic books, their publishers, their prices and their customers. The model stores data in Excel, and to make things readable, I use a columnar store. In this way, I can quickly add new entries to the table by adding columns. Each column has an arbitrary number of rows. I know this might not be the best way to store data, but bear with me here. This blog post looks at the processing of that data, not the storing of the data. Besides, in the real world, you sometimes have no choice but to start with ugly data.

 

The Ugly Way…

Let us proceed. First, take a look at Titles:

 

> Titles.orig <- data.frame(DC=c('Batman','Superman','Captain_Marvel',''),
                          Image=c('Youngblood','Spawn','',''),
                          Marvel=c('Spiderman','Iron_Man','Cable','Doctor_Strange'),
                          stringsAsFactors = FALSE)

> Titles.orig

              DC      Image         Marvel
1         Batman Youngblood      Spiderman
2       Superman      Spawn       Iron_Man
3 Captain_Marvel                     Cable
4                           Doctor_Strange

 

Notice that a transpose doesn’t really buy us anything. Instead of an arbitrary number of rows for each entry, the transpose gives us an arbitrary number of columns.

 

> t(Titles.orig)

       
       [,1]         [,2]       [,3]             [,4]            
DC     "Batman"     "Superman" "Captain_Marvel" ""              
Image  "Youngblood" "Spawn"    ""               ""              
Marvel "Spiderman"  "Iron_Man" "Cable"          "Doctor_Strange"

 

When I process Titles.orig in R, I first transform it to a key-value store. My approach relies on data frame index logic (commands inside the [] brackets).

In my original approach, I create two vectors, one that repeats the column several times, and another that un-packs (unlists) the data. When I put them together, I get key-value pairs (with some empties).

My first vector repeats each column name n times, with n being the number of rows. Since the data frame has four rows, I repeat each column name four times. I first try the rep() function.

 

> Titles <- Titles.orig
> rep(names(Titles),nrow(Titles))
 [1] "DC"     "Image"  "Marvel" "DC"     "Image"  "Marvel" "DC"     "Image"  "Marvel" "DC"    
[11] "Image"  "Marvel"

 

This attempt fails. I want it in the form: ‘DC, DC, DC, DC, Image, Image etc.’

After a few Google searches, I find that matrix() allows us to stack rows, so I stuff the repeat statement into matrix():

 

> matrix(rep(names(Titles),nrow(Titles)),nrow=nrow(Titles))

     [,1]     [,2]     [,3]    
[1,] "DC"     "Image"  "Marvel"
[2,] "Image"  "Marvel" "DC"    
[3,] "Marvel" "DC"     "Image" 
[4,] "DC"     "Image"  "Marvel"

 

Close, but not quite what I need. I then add the byrow flag:

 

> matrix(rep(names(Titles),nrow(Titles)),nrow=nrow(Titles),byrow='T')

     [,1] [,2]    [,3]    
[1,] "DC" "Image" "Marvel"
[2,] "DC" "Image" "Marvel"
[3,] "DC" "Image" "Marvel"
[4,] "DC" "Image" "Marvel"

 

From here, we convert to a vector:

 

> as.vector(matrix(rep(names(Titles),nrow(Titles)),nrow=nrow(Titles),byrow='T'))

 [1] "DC"     "DC"     "DC"     "DC"     "Image"  "Image"  "Image"  "Image"  "Marvel" "Marvel"
[11] "Marvel" "Marvel"

 

As you can see, as.vector() works “down the columns” by default (which makes sense, since columns are vectors).
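
Looking back, I could have skipped the matrix() gymnastics entirely: base R’s rep() takes an each argument that produces exactly the ordering I wanted.

> rep(names(Titles), each = nrow(Titles))
 [1] "DC"     "DC"     "DC"     "DC"     "Image"  "Image"  "Image"  "Image"  "Marvel" "Marvel"
[11] "Marvel" "Marvel"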

Let’s move past the titles. To create a vector from our data, we need to unlist() the data first and then vectorize it:

 

> as.vector(unlist(Titles))

 [1] "Batman"         "Superman"       "Captain_Marvel" ""               "Youngblood"    
 [6] "Spawn"          ""               ""               "Spiderman"      "Iron_Man"      
[11] "Cable"          "Doctor_Strange"

 

I bind these two vectors together as columns and then create a data frame.

 

> Titles <-  data.frame(cbind(as.vector(matrix(rep(names(Titles),nrow(Titles)),
                                             nrow=nrow(Titles),byrow='T')),
                            as.vector(unlist(Titles))))
> Titles

       X1             X2
1      DC         Batman
2      DC       Superman
3      DC Captain_Marvel
4      DC               
5   Image     Youngblood
6   Image          Spawn
7   Image               
8   Image               
9  Marvel      Spiderman
10 Marvel       Iron_Man
11 Marvel          Cable
12 Marvel Doctor_Strange

 

I give names to the data:

 

> names(Titles) <- c('publisher','title')

 

And then remove the empty rows. A lot of my early code follows this convention: I scan a data frame with index logic, using a comma to separate row and column logic. In the line below, I scan the index to return only rows that have a non-empty title, and return all columns. Such syntax appears a little confusing, as I reference the data frame Titles in three separate places.

 

> Titles <- Titles[which(Titles$title != ""),]

> Titles
   publisher          title
1         DC         Batman
2         DC       Superman
3         DC Captain_Marvel
5      Image     Youngblood
6      Image          Spawn
9     Marvel      Spiderman
10    Marvel       Iron_Man
11    Marvel          Cable
12    Marvel Doctor_Strange

 

The Pretty Way…

Let’s recap. We went through nested hell to transform the columnar table into a key-value table, and then we needed two more commands to name the data frame columns and remove the empties.

With pipes (dplyr and magrittr) and tidyr, we can produce the same result with one line of code.

 

> library("dplyr")
> library("magrittr")
> library("tidyr")
> Titles <- Titles.orig
> Titles %>% gather(publisher,title) %>% filter(nzchar(title))

  publisher          title
1        DC         Batman
2        DC       Superman
3        DC Captain_Marvel
4     Image     Youngblood
5     Image          Spawn
6    Marvel      Spiderman
7    Marvel       Iron_Man
8    Marvel          Cable
9    Marvel Doctor_Strange

 

To compute the result and assign it back to the variable in one step, we use the %<>% pipe.

 

> Titles %<>% gather(publisher,title) %>% filter(nzchar(title))
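
As an aside: if you are on a newer tidyr (1.0.0 or later), pivot_longer() supersedes gather(). Something like the line below should produce the same rows, though possibly in a different order:

> Titles.orig %>%
    pivot_longer(everything(), names_to = 'publisher', values_to = 'title') %>%
    filter(nzchar(title))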

 

More Pretty Code

Now we have a separate table of customers. This is a more traditional table, and we can arbitrarily add columns and rows as we see fit.

 

> Customers <- data.frame(title = c('Batman','Superman','Captain_Marvel',
                                  'Youngblood','Spawn','Spiderman','Iron_Man','Cable','Doctor_Strange'),
                        Micky = c(2,0,0,0,0,0,2,0,1),Mike = c(5,1,1,1,1,1,1,1,1),
                        Peter = c(1,1,0,0,0,1,1,2,0),
                        Davy = c(2,7,1,5,1,2,0,0,1),
                        stringsAsFactors=FALSE)
> Customers

           title Micky Mike Peter Davy
1         Batman     2    5     1    2
2       Superman     0    1     1    7
3 Captain_Marvel     0    1     0    1
4     Youngblood     0    1     0    5
5          Spawn     0    1     0    1
6      Spiderman     0    1     1    2
7       Iron_Man     2    1     1    0
8          Cable     0    1     2    0
9 Doctor_Strange     1    1     0    1

 

Let’s try the gather function on this table to see what we get. We want each row to contain the comic title, the customer name, and the quantity they want to purchase.

 

> Customers %>% gather(customer,qty) %>% suppressWarnings %>% head(12)

   customer            qty
1     title         Batman
2     title       Superman
3     title Captain_Marvel
4     title     Youngblood
5     title          Spawn
6     title      Spiderman
7     title       Iron_Man
8     title          Cable
9     title Doctor_Strange
10    Micky              2
11    Micky              0
12    Micky              0

 

As you can see, this is not what we want. To get the right result, we need to tell gather() which columns to collapse, here by specifying a start and end column.

 

> Customers %>% gather(customer,qty,Micky:Davy) %>% head(12)

            title customer qty
1          Batman    Micky   2
2        Superman    Micky   0
3  Captain_Marvel    Micky   0
4      Youngblood    Micky   0
5           Spawn    Micky   0
6       Spiderman    Micky   0
7        Iron_Man    Micky   2
8           Cable    Micky   0
9  Doctor_Strange    Micky   1
10         Batman     Mike   5
11       Superman     Mike   1
12 Captain_Marvel     Mike   1

 

I have an issue with this code: I need to refactor it each time I add a new customer.

To future-proof it, we modify the code as follows:

 

> Customers %>% gather(customer,qty,2:ncol(Customers)) %>% head(12)
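
Alternatively (and arguably tidier), gather() accepts negative column selections, so we could exclude the key column instead of counting columns; this should survive new customers just as well:

> Customers %>% gather(customer,qty,-title) %>% head(12)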

 

In a separate table I have prices for each title.

 

> Price <- data.frame(title = c('Batman','Superman','Captain_Marvel','Youngblood',
                              'Spawn','Spiderman','Iron_Man','Cable','Doctor_Strange'), 
                    price = c(1.95,1.95,2.95,2.95,1.75,1.75,3.95,3.95,1.95), 
                    stringsAsFactors = FALSE )
> Price

           title price
1         Batman  1.95
2       Superman  1.95
3 Captain_Marvel  2.95
4     Youngblood  2.95
5          Spawn  1.75
6      Spiderman  1.75
7       Iron_Man  3.95
8          Cable  3.95
9 Doctor_Strange  1.95

 

We can easily add a price column to Customers with the merge() function:

 

> Customers %>% merge(Price)

           title Micky Mike Peter Davy price
1         Batman     2    5     1    2  1.95
2          Cable     0    1     2    0  3.95
3 Captain_Marvel     0    1     0    1  2.95
4 Doctor_Strange     1    1     0    1  1.95
5       Iron_Man     2    1     1    0  3.95
6          Spawn     0    1     0    1  1.75
7      Spiderman     0    1     1    2  1.75
8       Superman     0    1     1    7  1.95
9     Youngblood     0    1     0    5  2.95

 

Pretty Showdown: Hard vs. Easy

How do we find per-customer totals? I’ll show a hard way and an easy way. Let’s look at the pipe/dplyr/tidyr method first.

First, we narrow the table and merge with price:

 

> Customers %>% gather(customer,qty,2:ncol(Customers)) %>% 
  merge(Price) %>% head(12)

            title customer qty price
1          Batman    Micky   2  1.95
2          Batman     Davy   2  1.95
3          Batman    Peter   1  1.95
4          Batman     Mike   5  1.95
5           Cable     Davy   0  3.95
6           Cable    Peter   2  3.95
7           Cable     Mike   1  3.95
8           Cable    Micky   0  3.95
9  Captain_Marvel     Davy   1  2.95
10 Captain_Marvel    Peter   0  2.95
11 Captain_Marvel     Mike   1  2.95
12 Captain_Marvel    Micky   0  2.95

 

Then, we add a fifth column that calculates the subtotal:

 

> Customers %>% gather(customer,qty,2:ncol(Customers)) %>% 
  merge(Price) %>% mutate(subtotal= qty * price) %>% 
  head(12)

            title customer qty price subtotal
1          Batman    Micky   2  1.95     3.90
2          Batman     Davy   2  1.95     3.90
3          Batman    Peter   1  1.95     1.95
4          Batman     Mike   5  1.95     9.75
5           Cable     Davy   0  3.95     0.00
6           Cable    Peter   2  3.95     7.90
7           Cable     Mike   1  3.95     3.95
8           Cable    Micky   0  3.95     0.00
9  Captain_Marvel     Davy   1  2.95     2.95
10 Captain_Marvel    Peter   0  2.95     0.00
11 Captain_Marvel     Mike   1  2.95     2.95
12 Captain_Marvel    Micky   0  2.95     0.00

 

Then, we sum the subtotal for each customer. We can achieve this with ease using the group_by() and summarize() functions:

 

> Customers %>% gather(customer,qty,2:ncol(Customers)) %>% 
  merge(Price) %>% mutate(subtotal= qty * price) %>% 
  group_by(customer) %>% summarize(sum(subtotal))

# A tibble: 4 x 2
  customer sum(subtotal)
     <chr>         <dbl>
1     Davy         42.45
2    Micky         13.75
3     Mike         30.95
4    Peter         17.50

 

Pop quiz… did we just execute the hard or the easy method to find the totals? I will show you the easy way next and you can decide for yourself. In short, we can solve this problem with simple linear algebra.

We first create our price vector:

 

> x <- Price$price

 

Then our quantity matrix:

 

> A <- Customers %>% select(Micky:Davy) %>% as.matrix()

 

We do a simple dot product and we’re done:

 

> x %*% A

     Micky  Mike Peter  Davy
[1,] 13.75 30.95  17.5 42.45

 

We could also do it in one line:

 

> Price$price %*% (Customers %>% select(Micky:Davy) %>% as.matrix())

     Micky  Mike Peter  Davy
[1,] 13.75 30.95  17.5 42.45

 

My Octave/MATLAB experience led me to use linear algebra right out of the gate. Sometimes, even in the face of fancy new functions, it turns out I produce beautiful code on the first try.