Kai Hendry's other blog archives

Experiencing CoreOS+Docker


Once upon a time there was chroot (notice, it's just 50LOC?!). chroot was a simple way of sandboxing your application. It didn't really work as well as some people wanted and along came Docker, which is a front end to LXC. It works, though it has a LOT of SLOC/complexity/features. Docker is monolithic and depends on Linux.

Today we have general packaged distributions like Debian & Archlinux. Their main faults were probably being too general and having poor support for upgrading and downgrading. Along comes CoreOS, a lightweight Linux OS with systemd (a modern init) & Docker. CoreOS is also monolithic and depends on Linux.

I've attempted to understand CoreOS before, but since I needed to move Greptweet to a VPS with more disk space... quickly... I "deep dived" into CoreOS & Docker and here is my writeup of the experience. Tip #1: the default user for CoreOS is "core", e.g. ssh core@ once you get, say, your CoreOS droplet going.


The 20LOC Greptweet Dockerfile took me almost all day to create, though this was my favourite accomplishment. I studied other Archlinux and Ubuntu Dockerfiles on Github for guidance on how to achieve this.

So now I have a succinct file that describes the environment Greptweet needs to function. I found it astonishing that the container for running a PHP app on Nginx is almost 1GB!

Yes, I need to re-implement greptweet in Golang to greatly reduce this huge bloat of a dependency!

Read only filesystem on CoreOS means no superuser or what?

I was hoping CoreOS would do away with root altogether. I'm seriously tired of sudo. I noticed read only mounts, whilst trying to disable password ssh logins to avoid loads of:

Failed password for root from $badman_IP port $highport ssh2

In my journalctl. Ok, if they are going to fix the config of sshd_config I thought, maybe they would do away with root?! PLEASE.

Haunting permissions

I hate UNIX permissions, hate hate hate. So with Docker your data is mounted on the host IIUC and your app stays firmly containerized.

But when your app writes data out on to a mount point, what THE HELL should the permissions be? I ended up just running chmod -R 777 on my Volume's mountpoint, though I should probably have used setfacl. What a mess!

User/group 33
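A narrower alternative to chmod -R 777 could look like the sketch below. It assumes the container process writes as uid 33 (the user/group 33 noted above), that the host filesystem is mounted with ACL support, and that setfacl is available on the host, none of which is confirmed for CoreOS here. The function name grant_uid is made up for illustration.

```shell
# Sketch: grant one numeric uid read/write access to a volume directory
# instead of opening it to everyone with chmod -R 777.
# Assumptions: setfacl exists on the host and the filesystem supports ACLs.
grant_uid() {
    dir=$1
    uid=$2
    # Access ACL for what already exists; X adds execute only on
    # directories (or files that are already executable).
    setfacl -R -m "u:${uid}:rwX" "$dir" &&
    # Default ACL so files created later inherit the same access.
    setfacl -R -m "d:u:${uid}:rwX" "$dir"
}
```

Usage would be something like: grant_uid /srv/www/greptweet.com 33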

How am I supposed to log CoreOS/Docker?!

I'm confused about Volume mounts. I run Greptweet like so: /usr/bin/docker run --name greptweet1 -v /srv/www/greptweet.com:/srv/http/u -p 80:80 greptweet, and /srv/http/u/ is where the data lives. But HOW am I supposed to get at my container's logs? Another volume mount?

How does CoreOS envision managing httpd logs? I don't understand. And how am I supposed to run logrotate!? "One step forward, two steps back" is playing in my mind.
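One possible answer, offered as a sketch rather than as what CoreOS prescribes, is indeed a second volume mount for the log directory. It assumes nginx inside the image writes to /var/log/nginx; the host directory /srv/log/greptweet and the function name run_greptweet are hypothetical.

```shell
# Sketch: the run command from above plus a second volume for nginx logs.
# Assumptions: nginx in the image logs to /var/log/nginx, and
# /srv/log/greptweet is a host directory invented for this example.
run_greptweet() {
    docker run --name greptweet1 \
        -v /srv/www/greptweet.com:/srv/http/u \
        -v /srv/log/greptweet:/var/log/nginx \
        -p 80:80 greptweet
}
```

Separately, docker logs greptweet1 shows whatever the container's main process writes to stdout and stderr, which only helps if the httpd is configured to log there.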


A jarring thing is that when you run a docker container, you IIUC are expected to run one process, i.e. the httpd.

Unfortunately with nginx, to get PHP working you need to run a separate (FastCGI) PHP process alongside the nginx httpd, hence the Greptweet Dockerfile uses Python's supervisor daemon to manage both processes. Urgh. I copied this paradigm from another Dockerfile. Tbh I was expecting to manage the processes with systemd inside the container. Now I have Python crapware in my container for managing nginx/php processes. Suck.

NO Cron

Greptweet used cron to create backups, relay stats and generate reports. Now AFAICT I don't have the basic utility of cron in my container. Now what?!
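One way around the missing cron might be a systemd timer on the host that starts a short-lived container. The sketch below only writes the two unit files. The unit names, the script path /srv/http/backup.sh, and the idea that the greptweet image can run that script directly are all assumptions made for illustration.

```shell
# Sketch: write a oneshot service plus a daily timer into a directory,
# as a host-side stand-in for a crontab entry.
# Hypothetical names: greptweet-backup.*, /srv/http/backup.sh.
write_backup_timer() {
    dir=$1
    cat > "$dir/greptweet-backup.service" <<'EOF'
[Unit]
Description=Greptweet backup (hypothetical example)
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
ExecStart=/usr/bin/docker run --rm --volumes-from greptweet1 greptweet /srv/http/backup.sh
EOF
    cat > "$dir/greptweet-backup.timer" <<'EOF'
[Unit]
Description=Run the Greptweet backup once a day

[Timer]
OnCalendar=daily

[Install]
WantedBy=timers.target
EOF
}
```

After write_backup_timer /etc/systemd/system, the timer would still need systemctl daemon-reload followed by systemctl enable and systemctl start on greptweet-backup.timer.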



As mentioned in my previous blog on CoreOS, I was quite excited about having "free" updates to my core host system. Sadly after looking at the logs, I'm not impressed.

There is little visibility into the actual update. I have recently found https://coreos.com/releases/ but it uses some horrible XML manifest to layer on the updates. Why can't the whole rootfs just be in git ffs?

Furthermore I noticed locksmithd which I think reboots the machine, but I'm not sure.

Oct 18 03:11:11 dc update_engine[458]: <request protocol="3.0" version="CoreOSUpdateEngine-" updaterversion="CoreOSUpdateEngine-0
Oct 18 03:11:11 dc update_engine[458]: <os version="Chateau" platform="CoreOS" sp="444.5.0_x86_64"></os>
Oct 18 03:11:11 dc update_engine[458]: <app appid="{e96281a6-d1af-4bde-9a0a-97b76e56dc57}" version="444.5.0" track="stable" from_track="
Oct 18 03:11:11 dc update_engine[458]: <ping active="1"></ping>
Oct 18 03:11:11 dc update_engine[458]: <updatecheck targetversionprefix=""></updatecheck>
Oct 18 03:11:11 dc update_engine[458]: <event eventtype="3" eventresult="2" previousversion=""></event>
Oct 18 03:11:11 dc update_engine[458]: </app>
Oct 18 03:11:11 dc update_engine[458]: </request>
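For a little more visibility, the version the host reports can be pulled out of those journal lines. This is a minimal sketch that assumes the log format shown above; the function name coreos_sp_versions is made up, and the update-engine unit name in the usage line is an assumption.

```shell
# Sketch: print the distinct OS version strings (the sp="..." attribute)
# found in update_engine journal lines read from stdin.
coreos_sp_versions() {
    sed -n 's/.*<os [^>]*sp="\([^"]*\)".*/\1/p' | sort -u
}
```

Possible usage: journalctl -u update-engine | coreos_sp_versions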

I've glanced over https://coreos.com/using-coreos/updates/ several times now and it's still not clear to me. As the maintainer of Webconverger's updates myself, I find our gitfs upgrade system MUCH CLEARER than how CoreOS updates are handled. I wonder when Docker 1.3 is going to hit CoreOS stable.

Keeping my Archlinux container uptodate is also a bit of a mystery to me...

CoreOS packaging is just WEIRD

It took me way too long to figure out how to enter a Docker 1.2 container and have a look. nsenter will be replaced by Docker 1.3's docker exec, but the way it installed was very intriguing.
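Once nsenter is on the host, getting into a Docker 1.2 container comes down to roughly the following. This is a sketch that assumes nsenter and sudo are available; the function name enter_container is invented here.

```shell
# Sketch: open a shell inside a running container on Docker 1.2 via nsenter.
# Assumes nsenter is already installed on the host and sudo is available.
enter_container() {
    name=$1
    # Ask Docker for the PID of the container's main process.
    pid=$(docker inspect --format '{{.State.Pid}}' "$name") || return 1
    # Join that process's namespaces and start a shell there.
    sudo nsenter --target "$pid" --mount --uts --ipc --net --pid /bin/sh
}
```

With Docker 1.3 the equivalent should be docker exec -it greptweet1 /bin/sh.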

In fact package management in CoreOS's eyes, I think, means starting a share/privileged container and mapping it back to the host system. That's a pretty darn wild way of doing things imo.

I've been BATTLING TO GET TMUX running. It was suggested that this screen CoreOS install guide might help me. There is also an odd "toolbox" alias to a Fedora container with tools, but it doesn't map back to the host. All this for a terminal multiplexer. OMG.

Starting your Docker container in CoreOS was non-trivial

Here is Greptweet's service file.

CoreOS's launch guide was a bit strange to me. Am I supposed to publish my Greptweet image, so the docker pull works? It could be a lot simpler I feel. I.e. why doesn't the docker daemon manage/start the containers itself!?


I think the basic idea of a lightweight host OS (CoreOS) and containers (Docker) has legs. I just wish it was as simple as chroot. Now I'm left thinking how Ansible/Chef/Puppet/Vagrant did such a bad job compared to the Dockerfile. Or perhaps blaming VPS hosters who never really got a decent API together to move/expand/inspect their VPS volumes.

Gosh, how did we get into this mess?!

So now system administrators run hypervisors aka CoreOS and spin up VPSes aka Docker containers all by themselves. Seems like another level of abstraction that empowers system administrators, but at the same time there is going to be a raft of bugs/pain to enjoy with this "movement". It's also slightly concerning that CoreOS/Docker seems to fly in the face of the Unix philosophy.

How much does it cost to run an Archlinux mirror on EC2

AWS Singapore kindly gifted http://hackerspace.sg/ with 500SGD of AWS credits.

The mirrors http://mirror.nus.edu.sg/ and http://download.nus.edu.sg are run by two separate competing groups at NUS, which oddly try to outdo each other in incompetence. Both have had several issues mirroring Archlinux in my two years of using them, so I thought: let's use these credits to host an Archlinux mirror!!

After much head scratching with the AWS jargon of {ebs,s3} and {hvm,paravirtual} EC2 Archlinux images, I launched an "ebs hvm" instance of m3.xlarge.

I got a nice 80GB zpool going for the mirror and everything was looking good. However, now to do the budgeting.

On demand pricing is $0.392 an hour

There are roughly 9000 hours in a year (8760, to be exact). So that's $3528. Eeeek, over budget by just 3000 dollars!

Ignoring the added complexity of Spot and EBS enhancements, a one year reserved instance under "Light Utilization Reserved Instances" (I am not sure what that means) is 497 dollars! Yes!!

I'm told "Light utilization means that you will not turn it on all the time". For 1 year I would need heavy utilization!

So a m3.xlarge would be: 981 (down payment) + 24 * 365 * 0.124 = $2067.24, about 1500 dollars over budget.
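The same sums can be redone with the exact 8760 hours of a non-leap year. A small sketch using the prices quoted above; the function name yearly_cost is made up.

```shell
# Sketch: yearly EC2 cost = upfront payment + hours in a year * hourly rate.
yearly_cost() {
    # $1 = upfront payment, $2 = hourly rate; both in dollars.
    awk -v upfront="$1" -v rate="$2" \
        'BEGIN { printf "%.2f\n", upfront + 24 * 365 * rate }'
}
```

yearly_cost 0 0.392 gives 3433.92 for on demand (a little under the rounded $3528), and yearly_cost 981 0.124 gives the 2067.24 above. Either way the conclusion is the same.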

Oh and bandwidth?

Well, a mirror is going to be a network whore. AWS charges for bandwidth. I tried their calculator (since I couldn't figure out what they charge per GB) with a lowball 1TB a month in and out and that costs almost 200USD.

Wow that's expensive! AWS EC2 (+ 500SGD credit) isn't suitable for an Archlinux mirror! :(

Digital Ocean quote

For a machine with at least 50GB of disk, you would need Digital Ocean's 60GB offering, with

  • 4GB / 2 CPUS

So that is 40USD a MONTH or 480USD a year. A lot cheaper than EC2, and bandwidth clearly priced at 2c per GB, so 1TB = 20USD IIUC.
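Putting droplet and bandwidth together for a year, as a rough sketch. It assumes the same lowball 1TB (taken as 1000GB) of traffic a month used for the AWS estimate, billed at the 2c per GB read from the quote; the function name do_yearly is made up.

```shell
# Sketch: yearly cost = 12 * (monthly droplet price + monthly traffic cost).
do_yearly() {
    # $1 = droplet price per month, $2 = GB transferred per month,
    # $3 = price per GB; all in US dollars.
    awk -v droplet="$1" -v gb="$2" -v per_gb="$3" \
        'BEGIN { printf "%.2f\n", 12 * (droplet + gb * per_gb) }'
}
```

do_yearly 40 1000 0.02 comes to 720.00 USD a year under those assumptions.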

Lessons learnt

Running a mirror is quite expensive on EC2. It's not really feasible on DO either without some free unmetered traffic.


Latest tips

Finding the rotation of an iPhone video

Using ffprobe, which should be included in an ffmpeg (https://twitter.com/FFmpeg) distribution:

for m in *.MOV
do
        r=$(ffprobe "$m" 2>&1 | grep -i rotate | awk '{print $3}')
        case $r in
                90)     echo "Needs to be $m 90 degrees" ;;
                180)    echo "Needs to be $m 180 degrees" ;;
                270)    echo "Needs to be $m 270 degrees" ;;
                *)      echo "No rotating required $m" ;;
        esac
done

Working with directories of unknown files

Using http://mywiki.wooledge.org/BashFAQ/020 as a starting point, you could:

find /tmp -type f -print0 | while IFS= read -r -d '' file
do
   echo properly escaped "$file" for doing stuff
done

However that's a bit ugly. And note that -d '' only works in bash. So none of this is "POSIX".

Another way of writing this, which works from bash 4, is to use dotglob/globstar:

shopt -s dotglob  # find .FILES
shopt -s globstar # make ** recurse
for f in /tmp/**
do
    if [[ -f $f && ! -L $f ]]
    then
        echo properly escaped "$f" for doing stuff
    fi
done

Another perhaps more POSIX way is

foo () { for i in "$@"; do echo "$i"; done; }; export -f foo; find /tmp -type f -exec bash -c 'foo "$@"' - {} + | wc -l

I.e. export a shell function to be executed by the -exec parameter of find, or just use a separate script file.
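Since export -f is itself a bash feature, that one-liner still depends on bash. A sketch that should stay within POSIX sh puts the loop inline in the script handed to sh -c; the function name list_files is made up for illustration.

```shell
# Sketch: visit every regular file under a directory using only POSIX
# find and sh; names with spaces or newlines are passed through intact.
list_files() {
    find "$1" -type f -exec sh -c '
        for f in "$@"; do
            printf "file: %s\n" "$f"
        done
    ' sh {} +
}
```

The extra sh after the script fills $0, so the file names handed over by find start at $1.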

Ensure www-data is always able to write

Ensure your fs is mounted with acl.

 mount | grep acl
/dev/root on / type ext3 (rw,noatime,errors=remount-ro,acl,barrier=0,data=writeback)

And to ensure www-data always has free rein:

setfacl -R -m default:group:www-data:rwx /srv/www

Xorg's version

12:04 <hendry> i'm using wheezy Xorg packages  1:7.7+3~deb7u1 and http://ix.io/d2p says X.Org X Server 1.12.4
12:04 <hendry> Release Date: 2012-08-27
12:04 <hendry> Is that right?
12:05 <jcristau> probably
12:11 <hendry> wondering why there is a mis-match with versions
12:11 <hendry> is there a newer Xorg available for wheezy? something to eek out performance with intel cards
12:12 <jcristau> there isn't a mismatch
12:12 <jcristau> and no
12:12 <hendry> 1:7.7+3~deb7u1 & 1.12.4 doesn't make sense to me ... :}
12:13 <jcristau> you can't understand that different things can have different versions?
12:20 <hendry> so what does 7.7+3~deb7u1 refer to ?
12:22 <pochu> 7.7 is the upstream version, +3 is the debian revision, and deb7u1 is the first update to Debian 7 (wheezy)
12:22 <psychon> http://www.x.org/wiki/Releases/7.7/
12:28 <jcristau> 7.7 is the base version of X.Org's X11 distribution
12:28 <jcristau> 1.12.4 is the version of the X server
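The package version can be taken apart with plain parameter expansion. This sketch follows the breakdown given in the chat, plus the leading "1:", which is the Debian epoch. It assumes the string contains both a ':' and a '+'; the function name is made up.

```shell
# Sketch: split a version such as 1:7.7+3~deb7u1 into epoch, base version
# and the Debian-side suffix. Assumes both ':' and '+' are present.
split_debian_version() {
    v=$1
    epoch=${v%%:*}      # text before the first ':'
    rest=${v#*:}        # everything after the epoch
    base=${rest%%+*}    # e.g. 7.7, the X11 distribution release
    suffix=${rest#*+}   # e.g. 3~deb7u1, revision and stable update
    printf 'epoch=%s base=%s suffix=%s\n' "$epoch" "$base" "$suffix"
}
```

The X server's own version (1.12.4 here) is a separate number that does not appear in the package version at all.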


Setting a read S3 policy from the command line

Easier than logging into https://console.aws.amazon.com/s3/, since I need to get my MFA device out every time.

x220:/tmp$ bash allow-read.sh b3-webc
s3://b3-webc/: Policy updated

allow-read.sh is just a script to help write the policy:

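x220:/tmp$ cat allow-read.sh
test "$1" || exit
# NB: the assignments and the policy around the Principal lines are reconstructed (assumed: plain public read)
s3_bucket=$1; tmp=$(mktemp)
s3cmd ls > $tmp
if ! grep -q $s3_bucket $tmp
then
        echo Could not find bucket s3://${s3_bucket}
        cat $tmp
        exit 1
fi
cat <<END > $tmp
{ "Statement": [ { "Effect": "Allow",
      "Principal": {
            "AWS": "*"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::${s3_bucket}/*" } ] }
END
s3cmd setpolicy $tmp s3://${s3_bucket}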
