Posts

  • Quick and easy VPNs with WireGuard

    WireGuard is the new kid on the block in the world of VPNs. It has been receiving a lot of attention lately, especially after Linus Torvalds himself praised the project last month, which resulted in a number of in-depth guides about its characteristics being published. The problem is that practical guides about its setup, including the official one, don’t show how quick and easy it can be. They are full of lengthy, complex and unneeded commands, when all that is needed is a pair of simple configuration files.

    This guide won’t describe how to actually install WireGuard, as that is thoroughly covered by the official documentation for every supported platform. WireGuard itself consists of a loadable kernel module that allows virtual WireGuard network interfaces to be created. Here, an EC2 instance located in Ireland and a virtual machine (based on Vagrant/VirtualBox) in Germany, both running Ubuntu, will be connected.

    The first step is to generate a pair of keys for every machine. WireGuard’s authentication system doesn’t rely on passwords or certificates that involve hard-to-maintain Certificate Authorities (CAs). Everything is done using private/public keys, as in SSH authentication:

    $ wg genkey | tee privatekey | wg pubkey > publickey
    $ ls -lh
    total 8.0K
    -rw-rw-r-- 1 ubuntu ubuntu 45 Sep 15 14:31 privatekey
    -rw-rw-r-- 1 ubuntu ubuntu 45 Sep 15 14:31 publickey
    

    In the server, the /etc/wireguard/wg0.conf configuration file will look like:

    [Interface]
    PrivateKey = 4MtNd3vq/Zb5tc8VgoigLyuONWoCQmnzLKFNuSYLiFY=
    Address = 192.168.255.1/24
    ListenPort = 51820
    PostUp = iptables -A FORWARD -i wg0 -j ACCEPT; iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE; sysctl net.ipv4.ip_forward=1
    PostDown = iptables -D FORWARD -i wg0 -j ACCEPT; iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE; sysctl net.ipv4.ip_forward=0
    
    [Peer]
    PublicKey = 0+/w1i901TEFRmEcUECqWab/nwmq0dZLehMzSOKUo04=
    AllowedIPs = 192.168.255.2/32
    

    Here’s an explanation of its fields:

    • PrivateKey is the server private key. It proves that the server is who it says it is, and the same goes for the clients on the other end: each one will be able to validate the identity of the other.
    • Address is the IP and network mask for the VPN network.
    • ListenPort defines the UDP port on which the server will listen for connections.
    • PostUp contains firewall rules and system commands needed for the server to act as a gateway, forwarding all network traffic. PostDown disables them when the VPN is deactivated. eth0 is the name of the main network interface, which may be something different, like ens5, if systemd’s Predictable Network Interface Names are being used.
    • PublicKey and AllowedIPs define which peers can connect to this server, through a combination of key pairs and IPs. It’s important to note that the IPs defined here are within the VPN network range; they are not the actual IPs the client will use to connect to the server over the internet.
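
    WireGuard supports multiple peers, so additional clients could later be added by appending further [Peer] sections to the server’s wg0.conf, each with the client’s own public key and a distinct VPN address. A hypothetical second client might look like this (the key is a placeholder, to be generated with wg genkey on that machine):

```ini
[Peer]
# hypothetical second client; generate its own key pair with wg genkey
PublicKey = <second client public key>
AllowedIPs = 192.168.255.3/32
```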

    The client will also have a /etc/wireguard/wg0.conf configuration file, but it will be a little bit different:

    [Interface]
    PrivateKey = yDZjYQwYdsgDmySbUcR0X7b+rdwfZ91rFYxz6m/NT08=
    Address = 192.168.255.2/24
    
    [Peer]
    PublicKey = e1HJ0ed/lUmCDRUGjCwFZ9Qm2Lt14jNE77TKXyIS1yk=
    AllowedIPs = 0.0.0.0/0
    Endpoint = ec2-34-253-52-138.eu-west-1.compute.amazonaws.com:51820
    

    The PrivateKey and Address fields here have the same meaning as in the server. The difference is that the Interface section won’t contain the server parts, like the listening ports and firewall commands. The Peer section contains the following fields:

    • PublicKey is the public key of the server which the client will connect to.
    • AllowedIPs is interesting here. It defines the networks whose traffic will be forwarded to the server. 0.0.0.0/0 means that all traffic, including connections to the internet, will use the server as a gateway. Using just the VPN network, like 192.168.255.0/24, would create a P2P VPN, where the client and server are able to reach each other, but any other traffic (e.g. to the internet) wouldn’t be forwarded through this connection.
    • Endpoint is the hostname or IP address, plus port, which the client will use to reach the server and establish the VPN connection.
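
    One optional field worth mentioning for clients behind NAT is PersistentKeepalive, which makes the peer send a keepalive packet at the given interval (in seconds), so that stateful firewalls and NAT mappings don’t expire while the tunnel is idle. A sketch of the client’s Peer section with it enabled:

```ini
[Peer]
PublicKey = e1HJ0ed/lUmCDRUGjCwFZ9Qm2Lt14jNE77TKXyIS1yk=
AllowedIPs = 0.0.0.0/0
Endpoint = ec2-34-253-52-138.eu-west-1.compute.amazonaws.com:51820
# optional: send a keepalive every 25 seconds so NAT mappings stay open
PersistentKeepalive = 25
```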

    With both machines configured, the VPN interface can be enabled on the server:

    [[email protected]:~]$ sudo wg-quick up wg0
    [#] ip link add wg0 type wireguard
    [#] wg setconf wg0 /dev/fd/63
    [#] ip address add 192.168.255.1/24 dev wg0
    [#] ip link set mtu 8921 dev wg0
    [#] ip link set wg0 up
    [#] iptables -A FORWARD -i wg0 -j ACCEPT; iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE; sysctl net.ipv4.ip_forward=1
    net.ipv4.ip_forward = 1
    

    If a message like the following is shown in this step:

    Warning: `/etc/wireguard/wg0.conf' is world accessible
    

    This means that the configuration file permissions are too broad, which they shouldn’t be, as there’s a private key in there. This can be fixed with sudo chmod 600 /etc/wireguard/wg0.conf.

    The command sudo wg shows the VPN status:

    [[email protected]:~]$ sudo wg
    interface: wg0
      public key: e1HJ0ed/lUmCDRUGjCwFZ9Qm2Lt14jNE77TKXyIS1yk=
      private key: (hidden)
      listening port: 51820
    
    peer: 0+/w1i901TEFRmEcUECqWab/nwmq0dZLehMzSOKUo04=
      allowed ips: 192.168.255.2/32
    

    From this point, the VPN can be enabled on the client using the same commands:

    [[email protected]:~]$ sudo wg-quick up wg0
    [#] ip link add wg0 type wireguard
    [#] wg setconf wg0 /dev/fd/63
    [#] ip address add 192.168.255.2/24 dev wg0
    [#] ip link set mtu 1420 dev wg0
    [#] ip link set wg0 up
    [#] wg set wg0 fwmark 51820
    [#] ip -4 route add 0.0.0.0/0 dev wg0 table 51820
    [#] ip -4 rule add not fwmark 51820 table 51820
    [#] ip -4 rule add table main suppress_prefixlength 0
    [[email protected]:~]$ sudo wg
    interface: wg0
      public key: 0+/w1i901TEFRmEcUECqWab/nwmq0dZLehMzSOKUo04=
      private key: (hidden)
      listening port: 47603
      fwmark: 0xca6c
    
    peer: e1HJ0ed/lUmCDRUGjCwFZ9Qm2Lt14jNE77TKXyIS1yk=
      endpoint: 34.253.52.138:51820
      allowed ips: 0.0.0.0/0
      latest handshake: 1 second ago
      transfer: 92 B received, 292 B sent
    [[email protected]:~]$ curl https://myhro.info/ip
    curl/7.47.0
    
    34.253.52.138
    
    IE
    

    The most impressive part is how quickly the connection is established. There are none of the multi-second handshakes found in other VPN solutions; it can be used instantly. After playing with it, it becomes easier to understand why WireGuard is attracting so much attention. Especially because it’s so unobtrusive that one can use it without even realizing it’s turned on.

  • How to mock Go methods

    Warning: this post wouldn’t exist if it wasn’t for the help of my long-time friend, former university and work colleague, Fernando Matos. We discussed the possibilities for a few hours in order to figure out the following implementations. I hope we can work together on a daily basis again in the future.

    Imagine the following Go code:

    main.go

    package main
    
    import "time"
    
    type Client struct {
    	timeout time.Duration
    }
    
    func New() Client {
    	return Client{timeout: 1 * time.Second}
    }
    
    func (c *Client) Fetch() string {
    	time.Sleep(c.timeout)
    	return "actual Fetch"
    }
    
    func main() {
    	c := New()
    	c.Fetch()
    }
    

    In this example, the Fetch() method is just sleeping for a pre-defined duration, but imagine that it is a real external API call, involving a slow and expensive network request. How can we test that?

    main_test.go

    package main
    
    import "testing"
    
    func TestFetch(t *testing.T) {
    	c := New()
    	r := c.Fetch()
    	t.Fatal(r)
    }
    

    If the actual Fetch() implementation is called, the test execution will take too long:

    $ go test
    --- FAIL: TestFetch (1.00s)
            main_test.go:8: actual Fetch
    FAIL
    exit status 1
    FAIL    _/Users/myhro/tmp     1.009s
    

    No one is going to wait a few seconds on each test run where this method is called a couple of times. A naive approach to circumvent that would be to replace this method with another one with the same name that avoids the slow operation:

    func (c *Client) Fetch() string {
    	return "mocked Fetch"
    }
    

    But in Go, this isn’t possible:

    ./main_test.go:5:6: (*Client).Fetch redeclared in this block
            previous declaration at ./main.go:13:6
    

    So we have to look for another solution, like the delegation design pattern. Instead of having the Fetch() method do what it is supposed to do, it delegates its responsibility to an encapsulated object.

    main.go

    package main
    
    import "time"
    
    type Client struct {
    	delegate clientDelegate
    	timeout  time.Duration
    }
    
    type clientDelegate interface {
    	delegatedFetch(time.Duration) string
    }
    
    func (c *Client) delegatedFetch(t time.Duration) string {
    	time.Sleep(t)
    	return "actual Fetch"
    }
    
    func New() Client {
    	n := Client{
    		delegate: &Client{},
    		timeout:  1 * time.Second,
    	}
    	return n
    }
    
    func (c *Client) Fetch() string {
    	return c.delegate.delegatedFetch(c.timeout)
    }
    
    func main() {
    	c := New()
    	c.Fetch()
    }
    

    This way, we can replace the implementation of this inner object without having to override the entire object that is being tested:

    main_test.go

    package main
    
    import (
    	"testing"
    	"time"
    )
    
    type fakeClient struct{}
    
    func (c *fakeClient) delegatedFetch(t time.Duration) string {
    	return "mocked Fetch"
    }
    
    func TestFetch(t *testing.T) {
    	c := New()
    	c.delegate = &fakeClient{}
    	r := c.Fetch()
    	t.Fatal(r)
    }
    

    Now the mocked Fetch() is called and the test execution finishes in no time:

    $ go test
    --- FAIL: TestFetch (0.00s)
            main_test.go:18: mocked Fetch
    FAIL
    exit status 1
    FAIL    _/Users/myhro/tmp     0.006s
    

    So the delegation pattern approach works, but there are a few drawbacks:

    • It needs an interface that is going to be used only by the methods that are supposed to be mocked;
    • The inner object can’t see its parent attributes, so they have to be passed as arguments;
    • This looks too verbose, and there should probably be a shorter/simpler way to do that.

    One cool thing about Go functions is that they can be treated as types, so they can be used as struct members or passed as arguments to other functions. This allows us to do things like:

    main.go

    package main
    
    import "time"
    
    type fetchType func(time.Duration) string
    
    type Client struct {
    	fetchImp fetchType
    	timeout  time.Duration
    }
    
    func sleepFetch(t time.Duration) string {
    	time.Sleep(t)
    	return "actual Fetch"
    }
    
    func New() Client {
    	n := Client{
    		fetchImp: sleepFetch,
    		timeout:  1 * time.Second,
    	}
    	return n
    }
    
    func (c *Client) Fetch() string {
    	return c.fetchImp(c.timeout)
    }
    
    func main() {
    	c := New()
    	c.Fetch()
    }
    

    And to replace the Fetch() implementation when testing:

    main_test.go

    package main
    
    import (
    	"testing"
    	"time"
    )
    
    func FakeFetch(t time.Duration) string {
    	return "mocked Fetch"
    }
    
    func TestFetch(t *testing.T) {
    	c := New()
    	c.fetchImp = FakeFetch
    	r := c.Fetch()
    	t.Fatal(r)
    }
    

    Achieving the same results:

    $ go test
    --- FAIL: TestFetch (0.00s)
            main_test.go:16: mocked Fetch
    FAIL
    exit status 1
    FAIL    _/Users/myhro/tmp     0.007s
    

    It’s interesting to notice that the fetchType declaration itself can be omitted, resulting in:

    type Client struct {
    	fetchImp func(time.Duration) string
    	timeout  time.Duration
    }
    

    Thus avoiding the creation of a dummy interface, type or struct only for mocking it later.

    Updates:

    1. Sandor Szücs pointed out that we have to be careful not to unintentionally export fake/internal methods or structs. Thanks!

  • How I finally migrated my whole website to the cloud

    This is going to be a tale of blood, sweat and tears, describing what I experienced over multiple years while trying to get rid of maintaining my own servers to host my website. This battle lasted, literally, for over a decade, until I was finally able to migrate every piece of code that comprises it (not just this blog). As of this week, it runs on cloud infrastructure spread over multiple providers, where I have little to no worries about its maintenance; it offers more reliability than I actually need and it’s also pretty cheap.

    The myhro.info domain was registered in April 2007. Initially I had no real intentions of hosting anything on top of it, as it was more like “oh, now I have my first domain!”. In the first years, the web hosting was provided by 000webhost. It was free, but I also had no guarantee about its availability and faced some downtime every once in a while. This continued until, after finding some interesting offers on Low End Box, I migrated it to its own VPS server by the end of 2010. I remember the year because it was my first one at the Information Systems course, around the same time I got my first part-time System Administrator job. The experience I got maintaining my own Linux + Apache + PHP + MySQL (LAMP) server was crucial in the beginning of my professional career and some learnings from the time are still useful to me these days.

    In April 2011 this blog was started on a self-hosted WordPress installation, in the same previously mentioned server. At first there was almost no service whose availability I really had to care about; probably the only exception was the Myhro.info URL Shortener (hosted under the myhro.net domain). The problem is that, after starting a blog, I had to worry about it being online at all times, otherwise people would not be able to read what I spent hours writing.

    Maintaining your own WordPress instance is not an easy job, even for small blogs. I spent endless hours fighting comment spam and keeping the installation secure and up-to-date. It was such a hassle that in less than two years, in the beginning of 2013, it was migrated to OctoPress, a static site generator in blog format. Publishing posts was now a matter of copying HTML files over rsync, but I still had to maintain an HTTP server for it. That’s why this blog was moved to GitHub Pages in 2014 and to Jekyll in 2015, where it is still hosted currently. Now I was free from maintaining a web server for it: this became a GitHub problem. At the same time the blog was migrated to Jekyll, its HTTPS support was re-enabled using Cloudflare (something that was lost in the GitHub Pages migration).

    Migrating blog.myhro.info to GitHub Pages + Cloudflare was marvelous and I haven’t worried about its maintenance ever since, not to mention that it also didn’t cost me a cent. Now I had to take care of other parts of my website that required server-side scripts, like myhro.info/ip: a page that shows the visitor’s IP address and user agent in simple plain text format. It’s really handy to use in the command line with curl and, in my experience, faster than ifconfig.me. The main issue with this service is that it was written in PHP.

    I don’t remember exactly when my first attempt at migrating the IP page to a cloud service happened, but it was probably between 2015 and 2016, when I tried AWS Lambda, rewriting it in a supported language. This didn’t work, as to make a Lambda function available via HTTP, one has to use the Amazon API Gateway, and it didn’t offer the possibility of using a simple endpoint like myhro.info/ip. I think this can be achieved with Amazon CloudFront, routing a specific path to a different origin, but it seemed too much work (and involved the usage of a bunch of different services) to achieve something that is really simple in nature. Trying to do the same using Google Cloud Functions yielded a similar experience.

    After these frustrating experiences, I stopped looking for alternatives. Maybe the technology to host a few dynamic pages (in this case, only one) for a mostly static website wasn’t there yet. Then, after two hopeless years, I read the announcement of Cloudflare Workers, which seemed exactly what I wanted: run code on a cloud service to answer specific requests. Finally, after it reached open beta and then general availability in 2018, I could truly and easily deploy small “serverless” applications tightly integrated with an already existing website. For that I just had to learn a little bit of JavaScript.

    It took me years of waiting and a few hours in a weekend to write JavaScript replacements for the PHP and Python (in the end I also migrated heroku.myhro.info, a service that returns random Heroku-style names) implementations, but I had finally reached the Holy Grail. Now it was a matter of moving the static parts of the website to Amazon S3, which is quite straightforward. S3 doesn’t offer HTTPS connections for static websites hosted in there, but as I already used Cloudflare, this was a no-brainer.

    Cloudflare Workers aren’t free (the minimum fee is $5/month), nor are they perfect. There are some serious limitations, like the “one worker per domain” restriction on non-Enterprise accounts, that can be a blocker for larger projects. But in this case, where I wanted a couple of dynamic pages for a mostly static website, they fit perfectly. I’m also happy to pay an amount I consider reasonable for a service I’ve been using for free for years. Looking at the company’s recent innovations, they may become even better in the future.

  • People, not software, first

    During my first year at the university, I remember asking my Algorithms and Data Structures professor about how to properly handle user input in C. “If we are going to store this input, how we can be sure that the user only types an integer and not a string?”. I don’t actually recall his exact answer, but it was the first time that someone mentioned to me that handling user input is a recurring problem in Computer Science. And this was way before I learned about related problems like database input sanitization.

    Unfortunately, I learned the wrong lesson that day. To me, the problem was that we can’t trust the user, so we have to babysit every step of their human-computer interaction. If we don’t, they will crash our beloved hand-crafted software, which will make us - software developers - sad. I couldn’t have been more wrong. When a piece of software crashes, no one gets sadder than the users who faced the error screen themselves. After all, they needed to use our software and weren’t able to.

    It took me a few years to figure this out. Even for a while after I graduated from the university, I thought that every piece of software that goes to production should be perfect. Every exception that could be raised in its code should be handled, and it could never crash in an unpredictable way. But in the real world, things do not work like this. And guess what? Users don’t care if your software isn’t perfect, as long as it suits their needs. All you have to care about is offering a friendly UI and giving proper feedback when things don’t work as expected.

    A couple weeks ago I found (on this post by Julia Evans - her blog is awesome, you should check it out) a series of tweets by Kelsey Hightower. He talks about how his working life got a lot more meaningful when he started to put people first. The part which I like most is when he mentions that computers are just machines waiting to break and that software is worse, because it’s always broken! Accepting that software isn’t just not perfect, but also broken by design, may be the best way to deal with the issues we face everyday in this industry.

    See, I’m not saying that we should be narcissistic/nihilistic professionals who don’t care about the quality of the work we publish to whomever our users are. I think that maybe, if we treat problems with their proper importance (e.g. pretty serious when they break the user experience, and not so much if it’s a known problem that isn’t user-visible and can be ignored), we can feel a little more proud of the systems we maintain. Otherwise we’ll be doing a Sisyphean job, which can only lead to an eternity of useless efforts and unending frustration.

  • Software engineering tips from a DevOps

    There’s a pattern that I’ve observed in nearly every software development company I worked for. When discussing solutions with developer teams, two distinct things can happen: if it’s something related to operating systems, networking or infrastructure in general, they agree with me, almost never arguing against anything. Sometimes this is bad, as I could be missing an important detail which they assumed was already figured out. On the other hand, when the matter is related to programming and software engineering, they ignore me virtually every time. It’s almost as if I hadn’t said anything meaningful.

    Based on this behavior, I’ve compiled a list of software engineering tips (or best practices) that I urge every developer to follow (and every DevOps/SysAdmin to remind them about). None of these items were invented by me, nor are they just theoretical things. These are battle-worn recommendations from someone who has been part of this industry for quite a few years and has seen many things go wrong when bad decisions are made. They are not hard to follow, but I’ve seen even experienced developers making the same mistakes mentioned here.

    Do not hardcode configuration parameters

    This one is tempting and a source of “works on my machine” symptoms. We know that developers love to have control over what they are creating and are sometimes afraid of someone running their software with badly formatted input. This is even worse with configuration parameters: “you put an ‘http://’ in there, please just use the hostname”. So this may be difficult to ask, but please, read configuration parameters from environment variables not only when you should, but every time you can. A default value can not only be useful for a specific environment (e.g. development), but also work as documentation for the expected format. See this Python example:

    import os
    
    database_url = os.environ.get('DATABASE_URL', 'postgres://user:[email protected]/app_db')
    redis_host = os.environ.get('REDIS_HOST', 'localhost')
    

    You can assume things like “this web app will only run on port 80”, but there isn’t a way to know this for sure when it goes to production. It may run inside a container that has its ports forwarded, or not. A reverse proxy may sit in front of it, forcing the app server to bind to a higher port. If the application can dynamically read this kind of information from its environment, you’ll be making both of our jobs way easier. I won’t have to ask you for changes and you won’t have to change the implementation (or worse, force me to do it).
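
    The same idea applied to the port problem can be sketched like this (PORT and HOST are hypothetical variable names, not a convention the application is required to follow):

```python
import os

# Read host/port from the environment, falling back to development-friendly
# defaults. int() makes a badly formatted value fail fast at startup.
port = int(os.environ.get('PORT', '8000'))
host = os.environ.get('HOST', '127.0.0.1')

print('binding to {}:{}'.format(host, port))
```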

    Do not try to reinvent the wheel

    We all know: software development is great. You can ask a computer to do anything you want and it will do it over and over again. Because of this, you may be attracted by the idea that not only can you do anything, but you can also get the best possible solution without thinking too much about it. The reality is that this is not going to happen, maybe not even if you are truly a genius. I’ve seen this happening multiple times, especially with parsers for structured text, “things that can be solved with a clever regex” and the like.

    The advice I can give to avoid this is: be humble. Assume that your solution may not work for every case. Actually, the most important part is to realize that if you can’t think of a use case, it doesn’t mean that it doesn’t exist and won’t ever appear. A few weeks from now an edge case may come to bite you. Look for open source frameworks and libraries that can do what you need. Learn to appreciate the work of the people who have been polishing these pieces of software for years - and allowing you to use them for free. Maybe you can even make a significant contribution to improve them.

    Gerald Sussman, the legendary MIT professor who co-authored the SICP book, was once asked about the switch from Scheme (a Lisp dialect) to Python in the Computer Science undergraduate program. His answer was that it made sense, because programming today is very different from what it was in the 80s and 90s. Today it’s “more like science”, where you grab some libraries and figure out if they can do what you want. So, stand on the shoulders of giants and only write from scratch what you really need.

    Opt for rock-solid battle-tested solutions

    This one is related to “not reinventing the wheel”, but it’s more about choosing mature implementations that have a greater chance to work. Sometimes you may be tempted to pick this new framework that everyone is talking about (some of them without having ever touched it), but maybe it is not really ready to be used. Of course someone will have to start using it to prove if it works or not, but you probably don’t want to face the problems these early adopters will hit. At least not on a production environment.

    The same can be said when you need, for instance, a network protocol. Using raw sockets can be fast and fun, but when your application grows a little bit you’ll realize you need a real protocol. Then you will be implementing compression to exchange data efficiently, defining a format to receive arguments, and so on. In the end, something like gRPC was the answer to all the problems you were trying to solve, but what you got is a stripped-down version of it that wasn’t tested by thousands of other developers around the globe.

    Closing thoughts

    I could list a few more things on this subject, but it’s better to keep this post short. Unfortunately, I’ve experienced some of the mentioned problems more than once in the last couple of months. I’m not an expert in managing people, but one of the causes seems to be closely related to ego. Sometimes I think it may be naivety, when the person in question can’t see the broader picture of the problem they are facing. At the same time, this is funny because it also happens with individuals with many years of experience.

    If you are a developer who identified with at least part of this list, you don’t really need to listen to me. Think about yourself, as a professional in our area, and what you can do to write software that is easy and robust to run in different environments. Always remember that your responsibility doesn’t end when you push code to a repository. You are at least as responsible as I am for running your code in production. In the end, it’s your creation, right?

Subscribe via RSS