Clojure Weekly - issue #3
Thanks to Clojure's dynamic and flexible nature, developing web applications on the JVM (Java Virtual Machine) is fast and fun. At the end of the post I've included a number of Clojure resources.
In a terminal, go to the directory where you want to create the project and run the command below. (Mine is ~/IdeaProjects.)
lein new app clj-web-app
Once it finishes, Leiningen will have generated a number of files and folders. Open project.clj and add the required dependencies.
Now open /src/clj_web_app/core.clj and add the handler code.
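The embedded snippets are not reproduced here, so below is a minimal sketch of what the two files could contain, assuming plain Ring with the Jetty adapter (the library choice and versions are my assumptions, not the original post's):

```clojure
;; project.clj - hypothetical additions: Ring core and the Jetty adapter
(defproject clj-web-app "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.11.1"]
                 [ring/ring-core "1.9.6"]
                 [ring/ring-jetty-adapter "1.9.6"]]
  :main ^:skip-aot clj-web-app.core)

;; src/clj_web_app/core.clj - a minimal handler served on port 3000
(ns clj-web-app.core
  (:require [ring.adapter.jetty :refer [run-jetty]])
  (:gen-class))

(defn handler [request]
  {:status  200
   :headers {"Content-Type" "text/plain"}
   :body    "Hello from Clojure!"})

(defn -main [& args]
  (run-jetty handler {:port 3000}))
```

With this in place, launching the app starts Jetty on port 3000 and every request gets the plain-text response from handler.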
To run the application, execute the following command in the project directory.
lein run
The application will be up at http://localhost:3000.
In the next installment I will cover middleware and take our web application to the next level. Thank you for sticking with the post this far.
You can access the example code in the GitHub repo.
I’ve had the pleasure of working with Piotrek Jagielski for about two weeks on a Clojure project. I’ve learned a lot, but there is still a lot for me to learn about Clojure. In this post I’ll describe what fascinated, disappointed, and astonished me about this programming language.
Before you start your journey with Clojure:
For many people, Clojure’s brackets are a reason to laugh. Jokes like “How many brackets did you write today?” were funny at first.
I have to admit that in the beginning, using the brackets was not easy for me. Once I realized that the brackets are just on the other side of the function name, everything became simple and I could code very fast.
After a few days I realized that this bracket structure forces me to think more about the structure of the code. As a result, the code gets refactored and divided into small functions.
Clojure forces you to use good programming habits.
Clojure is homoiconic, which means that Clojure programs are represented by Clojure data structures. When you read Clojure code, you see lists, maps, and vectors. How cool is that! You only have to know a few things and you can code.
Because Clojure code is represented as data structures, you can pass a data structure (a program) to a running JVM. Furthermore, compiling your code to bytecode (classes, jars) can be eliminated.
For example, when you want to test something, you are not obliged to start a new JVM with the tests. Instead, you can just synchronize your working file with a running REPL and run the function.
The traditional way of working with the JVM is obsolete.
In the picture above, on the left you can see an editor; on the right there is a running REPL.
You can run tests the same way, which is extremely fast. In our project we had ~80 tests; executing them all took about one second.
Simplicity is the ultimate sophistication.
After getting familiar with the language, it was really easy to read code. Of course, I was not aware of everything that was happening under the hood, but the consistency of the written program evoked a sense of control.
When data structures are your code, you need some additional operators to write effective programs. You should get to know operators like ‘->>’, ‘->’, ‘let’, ‘letfn’, ‘do’, ‘if’, ‘recur’ …
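As an illustration (my example, not from the post), the ->> macro threads each result into the last argument position of the next form, turning nested calls into a readable top-to-bottom pipeline:

```clojure
;; without threading, the computation reads inside-out
(reduce + (map inc (filter odd? (range 10))))   ; => 30

;; with ->> each step feeds the next one
(->> (range 10)
     (filter odd?)     ; (1 3 5 7 9)
     (map inc)         ; (2 4 6 8 10)
     (reduce +))       ; => 30
```

Both expressions are identical to the compiler; the macro only rearranges the forms at read time.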
Even when there is good documentation (e.g. for ‘let’), you have to spend some time analyzing it and trying out the examples.
As time goes on, new operators will be developed, but that may lead to multiple Clojure dialects. I can imagine teams (in the same company) using different sets of operators, dealing with the same problems in different ways. It is not good to have too many tools. Nevertheless, this is just my suspicion.
I wrote a function that rounds numbers. Even though the function was simple, I wanted to write a test for it, because I was not sure I had used the API correctly. The test function is below:
(let [result (fixture/round 8.211M)]
  (is (= 8.21M result))) ;; expected value assumed for illustration
Unfortunately, the tests did not pass. This is the only message I received:
Great. There is nothing better than a good exception message. I spent a lot of time trying to solve this, and the solution turned out to be extremely simple.
My function was defined with defn- instead of defn. defn- means private scope, so the test code could not access the function under test.
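A minimal reconstruction of the situation; the rounding implementation is an assumption, and #' is the usual way to reach a private var from a test:

```clojure
(ns fixture)

(defn- round [x]   ; defn- creates a PRIVATE var
  ;; hypothetical implementation: round to two decimal places
  (.setScale x 2 java.math.RoundingMode/HALF_UP))

(ns fixture-test)

;; (fixture/round 8.211M)  ; fails: round is not public
(#'fixture/round 8.211M)   ; works: calling the var object bypasses privacy
```

Referring to the var itself with #'fixture/round (or simply switching the definition to defn) makes the function reachable from the test namespace.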
Assertions can also be misleading. When the tested code does not work properly and returns wrong results, the error messages look like this:
ERROR in math-test/math-operation-test (RT.java:528)
I haven’t had time to investigate it, but in my opinion it should work out of the box.
It is only a matter of time until the tools get better. For now, these problems will slow you down, and they are not nice to work with.
Clojure’s concurrency support impressed me. Until then, I knew only the standard Java synchronization model and the Scala actor model. I had never thought that concurrency problems could be solved in a different way. I will explain the Clojure approach to concurrency in detail.
The closest Clojure analogy to variables are vars, which can be created with def. We also have local bindings, which exist only within a let scope. If we rebind the value of amount, the change takes effect only in the local context.
Nothing unusual; this is the behavior we would expect.
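The lost a01/a02 snippets can be sketched roughly like this:

```clojure
(def amount 10)        ; a var - the closest analogy to a variable

(defn a02 []
  (println amount)     ; 10 - the root value of the var
  (let [amount 100]    ; a local binding shadows the var...
    (println amount))  ; 100 - ...but only inside the let scope
  (println amount))    ; 10 - the var itself is unchanged
```

Calling (a02) prints 10, 100, 10: the let binding never touches the var's root value.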
The whole idea of concurrent access to variables can be written in one sentence: refs ensure safe shared access to variables via STM, where mutation can only occur inside a transaction.
Let me explain it step by step.
A ref (reference) is a special type that holds a reference to your object. As you would expect, the basic things you can do with it are storing and reading values.
STM stands for Software Transactional Memory, an alternative to lock-based synchronization. If you like theory, continue with Wikipedia; otherwise, keep reading for the examples.
In the a03 example, the second line creates a reference named amount with a current value of 10. The third line reads the value of the reference amount by dereferencing it; the printed result is 10.
In a04 we try to modify the value of the reference amount to 100 using the ref-set command. But it won’t work; instead we get an exception:
IllegalStateException No transaction running clojure.lang.LockingTransaction.getEx (LockingTransaction.java:208)
In a05 we fix the code by using the dosync operation. dosync creates a transaction, and only inside it can the referenced value be changed.
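Putting the three lost snippets (a03, a04, a05) together as one hedged sketch:

```clojure
(def amount (ref 10))       ; a03: create a reference holding 10
(println @amount)           ; reading (dereferencing) it prints 10

;; a04: writing outside a transaction fails
;; (ref-set amount 100)     ; => IllegalStateException: No transaction running

;; a05: dosync opens a transaction, inside which the write succeeds
(dosync
  (ref-set amount 100))
(println @amount)           ; prints 100
```

The deref (@) works anywhere, but every write has to be wrapped in dosync.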
The aim of the previous examples was to get familiar with the new operators and basic behavior.
Below, I’ve prepared an example to illustrate the nuts and bolts of STM: transactions and rollbacks.
Imagine we have two references for holding data: source-vector, containing the three elements “A”, “B”, and “C”, and destination-vector, which starts out empty. Our goal is to copy the whole source vector to the destination vector. Unfortunately, we can only use a function that copies elements one by one. Moreover, we have three threads that will do the copying, started with the future function.
Keep in mind that this is probably not the best way to copy a vector, but it illustrates how STM works. The copying is done by the copy-vector function, which takes the source and destination refs.
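The embedded definition is gone, so here is a rough sketch of how it might have worked, with future starting the threads; the println messages follow the output described below:

```clojure
(def source-vector (ref ["A" "B" "C"]))
(def destination-vector (ref []))

(defn copy-one! [source destination]
  ;; copy the next element inside a transaction; on a conflicting
  ;; write the whole dosync is rolled back and retried, so
  ;; "Trying to write" can be printed several times per element
  (dosync
    (let [n (count @destination)]
      (when (< n (count @source))
        (let [el (nth @source n)]
          (println "Trying to write" el)
          (alter destination conj el)
          el)))))

(defn copy-vector [source destination]
  (let [threads (doall
                  (repeatedly 3        ; three competing threads
                    #(future
                       (dotimes [_ (count @source)]
                         (when-let [el (copy-one! source destination)]
                           (println "Successful write" el))))))]
    (doseq [t threads] @t)             ; wait for every thread
    @destination))
```

Running (copy-vector source-vector destination-vector) prints a mix of "Trying to write" and "Successful write" messages and returns ["A" "B" "C"]: commits are serialized by the STM, so exactly three appends succeed.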
The output of the function shows that the result is correct: the destination vector ends up with three elements. Between the “Successful write” messages there are many messages starting with “Trying to write”. What does that mean? Rollbacks and retries occurred.
Each thread started to copy the vector, but only one succeeded; the remaining two threads had to roll back their work and try again.
When thread A (the red one) wants to write a variable and notices that the value has been changed by someone else, a conflict occurs. As a result, it abandons its current work and retries the whole dosync section. It keeps trying until every write operation succeeds.
A dosync section has to be pure, without side effects. For example, you cannot send an email from it, because you might send ten emails instead of one.
There is a lot that Java developers can gain from Clojure. They can learn a new way to approach code and express problems in it. They can also discover tools like STM.
If you want to develop your skills, you should definitely experiment with Clojure.
Recently I began my adventure with the Clojure programming language. As a result, I’ve decided to share the knowledge I’ve gathered, and my opinions, in this and a few upcoming posts.
I had to implement an algorithm that depends on the current date. The core input for this algorithm is the number of days between the current date and some date in the future.
Therefore, there is a call somewhere in the code:
(. java.time.LocalDate now)
For the tests to be stable, I had to make sure that this call always returns the same day.
I decided to extract the creation of the current date into its own function:
(defn now-date [] (. java.time.LocalDate now))
During tests I declared a different function:
(defn fixed-date [] (. java.time.LocalDate of 2015 1 28))
Passing in the function that creates the current date solved the problem. It worked great, but it had the following disadvantages:
Since I had a function returning the current time, I decided to find a way of overriding its definition in tests. I found that there is an operation called with-redefs-fn, which allows re-defining a function temporarily in a local context. With the fixed-date function defined, the block of code looks like this:
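The missing block presumably resembled the following sketch; the fixture namespace and the exact assertion are my assumptions:

```clojure
(ns fixture)
(defn now-date [] (. java.time.LocalDate now))

(ns fixture-test
  (:require [clojure.test :refer [deftest is]]))

(defn fixed-date [] (. java.time.LocalDate of 2015 1 28))

(deftest algorithm-test
  ;; with-redefs-fn swaps the var's root binding for the duration
  ;; of the passed function, then restores it
  (with-redefs-fn {#'fixture/now-date fixed-date}
    #(is (= (. java.time.LocalDate of 2015 1 28)
            (fixture/now-date)))))
```

There is also the with-redefs macro, which wraps the same mechanism in a binding-style form.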
fixture/now-date is a reference to the function I wanted to replace. This time I was amazed by the language’s possibilities. But there was one more problem to solve: I did not want to use the Java notation.
There is a library called clj-time. It wraps the Joda-Time library and makes the Clojure code friendlier. I wanted to stick with the Java 8 date library, but I did not see any alternatives.
So I replaced (. java.time.LocalDate now) with (t/now), did the same for the creation of fixed dates, and then I came up with an idea.
Maybe I should replace clj-time itself? My production code would be simpler, and the test code would be simpler too!
This is my final solution. I am still impressed by how easily it can be done.
I have been using Clojure for a week. If you have any other ideas on how to solve this problem, leave a comment and let me know.
Thanks so much for your support and feedback in the latest survey.
Our call for proposals for new projects will close on Thursday, April 9th, 2020 at 11:59pm PST.
This time around, we’re adding two new questions to our application to help us pick projects:
Note that if you haven’t been impacted by either of these, you are still eligible to receive Clojurists Together funding.
We’ll be funding four projects $9,000 each over three months ($3,000/mo). This is ideally May-July, but we’re able to be a little bit flexible on the start timing.
There were 60 respondents to the survey, down from 77 in the last survey. The highlights are presented below.
The main things our members were interested in:
If you work on any of these kinds of projects, please look at applying for funding.
If you’re a maintainer of any of these projects, please consider applying.
Our members also mentioned GraalVM support as something they’d like to see improved.
A sampling of comments:
For company members:
For developer members:
Could do better:
immutable: not capable of or susceptible to change
Problems with mutability, or rather the lack of immutability, aren’t a new thing. They have been around for as long as there have been programming languages.
For example, any Python developer knows that the mutable and immutable data types in Python cause a lot of headaches, even for advanced developers. Consider this example:
>>> foo = ['hello']
>>> print(foo)
['hello']
>>> bar = foo
>>> bar += ['world']
>>> print(foo)
['hello', 'world']  <-- WHAT IS HAPPENING?
What is happening? Because foo was never modified directly, but bar was, anyone would expect the following:
>>> print(foo)
['hello']  <-- WHY IS THIS NOT HAPPENING?
But that isn’t happening, and that is mutability at work.
In Python, if you assign a variable to another variable of a mutable data type, any changes are reflected by both variables. In this example, bar = foo makes the new variable bar just an alias for foo.
Put simply, mutable means ‘can change’ and immutable means ’cannot change.’
Languages such as Rust, Erlang, Scala, Haskell, and Clojure offer immutable data structures and single-assignment variables, on the premise that immutability leads to better code: simpler to understand and easier to maintain.
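In Clojure, for instance, the Python surprise above cannot happen, because the core collections are immutable; a small sketch:

```clojure
(def foo ["hello"])
(def bar (conj foo "world"))   ; conj returns a NEW vector

(println foo)                  ; ["hello"] - foo is unchanged
(println bar)                  ; ["hello" "world"]
```

Since no operation can modify the original vector, aliasing is always safe.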
That’s all beautiful, Adrian, but what does it have to do with software architectures?
Headaches… and no one likes headaches.
The immutable infrastructure paradigm comes from the same ideas behind immutability in programming languages — your architecture doesn’t change once deployed, and it saves you from headaches.
Let’s dive into it!
Uptime, according to Wikipedia, “is a measure of system reliability, expressed as the percentage of time a machine, typically a computer, has been working and available. Uptime is the opposite of downtime.”
Not so long ago, it was common for folks to brag about uptime.
The problem, though, is that long periods of uptime often indicate potentially lethal problems, because critical updates, whether software or hardware, often require rebooting.
Let me ask you this simple question — which of the below makes you more anxious?
(1) Rebooting a server that’s been up for 16 years.
(2) Rebooting a server that’s just been built.
That’s right, me too. Just writing it gives me anxiety. But why?
I have no idea what has changed in the 16 years since its initial boot up!
How many undocumented hotfixes were applied? Why was it never rebooted? Are there hidden, magic dependencies?
To me, that’s terrifying.
Old servers can become time-bombs. We start with a hotfix on a Friday night before the weekend.
We say: “Just a quick edit to the config file to save the weekend.”
We promise: “I will document and automate the fix first thing Monday morning.”
But we never do.
One hotfix after the other, and we end up with a dangerous drifting time-bomb ready to blow up our production environment.
In a traditional IT infrastructure, servers are typically updated and modified in place — SSH-ing into servers and installing, updating or upgrading packages, tweaking, and optimizing configuration files. All these were, and often still are, standard practices. Less common practice, though, is to document all these changes.
These practices make servers mutable; they change after they are created.
Besides giving you headaches, mutations also lead to configuration drifts, which happen when active configurations diverge from the initially approved and deployed configuration.
One of the main challenges with configuration drifts is that they make it hard to “just replace” or debug things — making recovering from issues difficult.
Configuration drifts often were, and still are, a consequence of low automation and manual processes. Back then, deployment tools were generally expensive and not as available as they are today. They required deep expertise, and thus automation wasn’t a priority.
Don’t fear what you know about your architecture; fear what you don’t know!
Remember that failures happen all the time, so one of the most important things you can optimize is your recovery time from failures, or MTTR (mean time to recovery).
“Failures are a given, and everything will eventually fail over time.” — Werner Vogels, CTO at Amazon.
Moreover, if your tests and validations occur in an environment in which configurations don’t match those of your production environment, well, it’s useless.
Another common practice in traditional infrastructure is mutable deployments: the deployment pipeline gets the code, fetches the dependencies, builds the artifacts, and deploys them for every environment stage of the pipeline. One build per environment.
What you test and validate isn’t what you deploy!
I’ve seen places where deploying to production meant first launching any instance (“any” since it wasn’t the same between environments), fetching the code from GitHub, building it, replacing the artifacts in-place — and finally — rebooting the server. One step was missing, though — pray.
I’m sure you are all either smiling or are horrified by the idea — but think about the following:
How many of your applications fetch dependencies from the internet during the build process, or worse, at runtime?
>>> pip install -r requirements.txt
>>> npm install
>>> docker build
What guarantees do you have that the dependencies you are installing don’t change between deployments? And what if you are deploying to multiple machines or environments?
Don’t you believe me?
Again, what you test and validate isn’t what you deploy!
Finally, installing dependencies at runtime is also an attack vector for malicious code injection, and it makes auto-scaling slow (more on this below).
That deployment behavior is fragile at best and leads to deployment mistakes and frequent rollbacks. And that’s if you’re lucky.
Luckily, there’s a solution!
“This is your last chance. After this, there is no turning back. You take the blue pill, the story ends. You wake up in your bed and believe whatever you want to. You take the red pill, you stay in Wonderland, and I show you how deep the rabbit hole goes. Remember, all I’m offering is the truth. Nothing more.” — Morpheus, The Matrix (1999)
Put simply, immutable infrastructure is a model in which no updates, security patches, or configuration changes happen “in-place” on production systems. If any change is needed, a new version of the architecture is built and deployed into production.
The most common implementation of the immutable infrastructure paradigm is the immutable server. It means that if a server needs an update or a fix, new servers are deployed instead of updating the ones already used. So, instead of logging in via SSH into the server and updating the software version, every change in the application starts with a software push to the code repository, e.g., git push. Since changes aren’t allowed in immutable infrastructure, you can be sure about the state of the deployed system.
Deploying applications in immutable infrastructures should use the canary deployment pattern. Canary deployment is a technique used to reduce the risk of failure when new versions of applications enter production by creating a new environment with the latest version of the software. You then gradually roll out the change to a small subset of users, slowly making it available to everybody if no deployment errors are detected.
Immutable infrastructures are more consistent, reliable, and predictable; and they simplify many aspects of software development and operations by preventing common issues related to mutability.
The term “immutable infrastructure” was coined by Chad Fowler in his blog post “Trash Your Servers and Burn Your Code: Immutable Infrastructure and Disposable Components,” published in 2013. Since then, the idea has — rightly so — gained popularity and followers, especially as systems have become more complex and distributed.
Some of the benefits include:
(1) Reduction in configuration drifts — By frequently replacing servers from a base, known, and version-controlled configuration, the infrastructure is reset to a known state, avoiding configuration drifts. All configuration changes start with a verified and documented configuration push to the code repository, e.g., git push. Since no changes are allowed on deployed servers, you can remove SSH access permanently. That prevents manual and undocumented hotfixes, which result in complicated or hard-to-reproduce setups and often lead to downtime.
(2) Simplified deployments — Deployments are simplified because they don’t need to support upgrade scenarios. Upgrades are just new deployments. Of course, system upgrades in immutable infrastructure are slightly slower since any change requires a full redeploy. The key here is pipeline automation!
(3) Reliable atomic deployments — Deployments either complete successfully, or nothing changes. It renders the deployment process more reliable and trustworthy, with no in-between states. Plus, it’s easier to comprehend.
(4) Safer deployments with fast rollback and recovery processes — Deployments using canary patterns are safer because the previous working version isn’t changed; you can roll back to it if deployment errors are detected. Additionally, the same process used to deploy the new version is used to roll back to older versions, which makes the deployment process safer.
(5) Consistent testing and debugging environments — Since all servers running a particular application use the same image, there are no differences between environments. One build deployed to multiple environments. It prevents inconsistent environments and simplifies testing and debugging.
(6) Increased scalability — Since servers use the same base image, they are consistent and repeatable. It makes auto-scaling trivial to implement, significantly increasing your capacity to scale on-demand.
(7) Simplified toolchain — The toolchain is simplified since you can get rid of the configuration management tools that manage production software upgrades. No extra tools or agents on servers: changes are made to the base image, tested, and rolled out.
(8) Increased security — By denying all changes to servers, you can disable SSH and remove Shell access to servers. That reduces the attack vector for bad actors, improving your organization’s security posture.
Let’s take a look at an essential aspect of immutable infrastructure — the immutable server.
An immutable server, or golden image, is a standardized template for your application server.
Typically, the golden image starts from a base image from which you remove unnecessary packages, and which you then harden and apply security patches to. Often it also includes agents for monitoring, tools for shipping logs off the instance, security auditing, and performance analysis.
Using golden images ensures you have consistent, reviewed, tested, and approved images for use within your organization.
As you identify potential new vulnerabilities in the software, you can update the golden image with the appropriate security patches, test the image, and deploy the newly created golden image to your environment.
Manually updating golden images is time-consuming and error-prone. Therefore, you should automate the creation of golden images with open-source tools like Packer or Netflix’s Aminator, or AWS tools such as EC2 Image Builder.
Once automated, you can build standardized and repeatable processes to:
These processes help improve the operation and security posture of organizations by limiting the attack surface of bad actors and mitigating risk.
EC2 Image Builder simplifies the creation, maintenance, validation, sharing, and deployment of both Linux and Windows Server images on EC2 or on-premises. It provides built-in automation, validation, and AWS-provided security settings for keeping images up-to-date and secure. It also enables version-control of your images for easy revision management. In other words, it is perfect for building golden images as it simplifies many of the manual processes required.
(1) Include as many dependencies as possible in your golden images. It will give you the most confidence that the image tested is what is deployed to production. It will also improve the scalability of your application (more on scalability below).
(2) Some configurations need to occur at the application start time. Consider service discovery mechanisms to help, or simply build a mechanism around the AWS metadata URI.
Connect to AWS EC2 metadata URI (http://169.254.169.254/latest/dynamic/) and get the instanceID. From the instanceID, query the instance tags, e.g “config location”, and get the configuration from the value of the tag.
For more information on using the metadata URI, check out this brilliant session from re:Invent 2016, Life Without SSH.
(3) Since immutable servers mean you can’t update system configuration in place and need to redeploy to apply changes to the configuration, consider decoupling physical server addresses from their roles. Using Amazon Route 53 with a private hosted zone and DNS within Amazon VPC is perfect for that.
(4) Building a stateless application is mandatory since any server can get terminated and rebuilt, and it should happen without any loss of data. If you need stickiness, keep it to the bare minimum. In a stateless service, the application must treat all client requests independently of prior requests or sessions and should never store any information on local disks or memory. Sharing state with resources in an auto-scaling group should be done using in-memory object caching systems such as Memcached, Redis, or EVCache, or distributed databases like Cassandra or DynamoDB, depending on the structure of your object and requirements in terms of performance. You can also use a shared file system such as Amazon EFS, a fully managed elastic NFS file system.
(5) On EC2 instances, always opt for mountable storage devices like Amazon EBS volumes, which can be attached to new servers when old ones are terminated. You can use EBS Snapshots with automated lifecycle policies to back up your volumes in Amazon S3, ensuring the geographic protection of your data and business continuity. EBS Snapshots can also be attached quickly to forensic EC2 instances if necessary.
Note: At the application design phase, make sure you separate ephemeral from persistent data and ensure you are persisting only the data you need so that you can avoid unnecessary costs. It also dramatically simplifies operations.
(6) Ship log files off EC2 instances and send them to a central log server. Logstash, Elasticsearch, and Kibana (the ELK stack) are widely used for this purpose; the stack is also available as a managed AWS service.
The speed at which you can tolerate auto-scaling will define what technology you should use. Let me explain:
I started using auto-scaling with EC2 like most early adopters (ten years ago there wasn’t a choice). The first mistake I made was to launch new instances with initialization scripts to do the software configuration whenever a new instance launched. It was very slow to scale and extremely error-prone.
Creating golden images enabled me to shift the configuration setup from the instance launch to an earlier ‘baking’ time built within the CI/CD pipeline.
It is often difficult to find the right balance between what you bake into an image and what you do at scaling time. If you’re smart with service discovery, you shouldn’t have to run or configure anything at startup time. It should be the goal because the less you have to do at startup time, the faster your application will scale up. In addition to being faster at scaling up, the more scripts and configurations you run at initialization time, the higher the chance that something will go wrong.
Finally, what became the holy grail of golden images was to get rid of the configuration scripts and replace them with Dockerfiles. Testing became a lot easier because the container running the application was the same from the developer’s laptop to production.
It is safe to say that containers have today become the standard method to package and run applications. Yet, similarly to applications running on EC2 instances, applications running in containers are prone to the same issues: updating, drift, overhead, and security.
Bottlerocket, AWS’s Linux-based operating system purpose-built for running containers, reflects much of what we have discussed so far:
First, it includes only the software essential to running containers, which significantly reduces the attack surface compared to general-purpose operating systems.
Also, it uses a primarily read-only file system that is integrity-checked at boot time via dm-verity. SSH access is discouraged and is available only as part of a separate admin container that you can enable on an as-needed basis for troubleshooting purposes.
Bottlerocket doesn’t have a package manager, and updates are applied, and can be rolled back, in a single atomic action, reducing potential update errors. It has built-in integrations with AWS services for container orchestration, registries, and observability, making it easy to start with and deploy.
Finally, Bottlerocket supports Docker images and images that conform to the Open Container Initiative (OCI) image format.
In summary, Bottlerocket is a minimal OS with an atomic update and rollback mechanism which gives limited SSH access. In other words, it’s a perfect fit for our immutable infrastructure model.
NO KEYS on server = better security!
SSH turned OFF = better security!
The beauty of immutable infrastructure is that it not only solves many of the issues discussed thus far, but it also transforms security. Let me explain.
Mutability is one of the most critical attack vectors for cyber crimes.
When a bad actor attacks a host, most of the time it will try to modify servers in some way — for example, changing configuration files, opening network ports, replacing binaries, modifying libraries, or injecting new code.
While detecting such attacks is essential, preventing them is much more important.
In a mutable system, how do you know whether changes performed on a server are legitimate? Once a bad actor has the credentials, you simply can’t know anymore.
The best strategy you can leverage is immutability: simply deny all changes to the server.
A change means the server is compromised, and it should either be quarantined, stopped, or terminated immediately.
That is DevSecOps at its best! Detect. Nuke. Replace.
Extending the idea to cloud operations, the immutability paradigm lets you monitor for any unauthorized changes happening in the infrastructure.
Doing this on AWS Cloud means detecting changes using AWS CloudTrail and AWS Config, alerting via SNS, remediating with AWS Lambda, and replacing with AWS CloudFormation.
“Your goal is to raise the cost of attacks, ideally beginning at design” — Controlled chaos: The inevitable marriage of DevOps and security, Kelly Shortridge.
It is the future for security in the cloud, and you should embrace it now.
To support application deployment in immutable infrastructure, you should use an immutable deployment pattern. My favorite is the canary deployment. This is a technique used to reduce the risk of failure when new versions of applications enter production, by gradually rolling out the change to a small subset of users and then slowly rolling it out to the entire infrastructure and making it available to everybody. Canary deployment is sometimes called a phased or incremental rollout.
According to Kat Eschner, the origin of the name canary deployment comes from an old British mining tradition where miners used canaries to detect carbon monoxide and toxic gases in coal mines. To ensure mines were safe to enter, miners would send in canaries first, and if the canary died or got ill, the miners would evacuate.
Using canary deployment, you deploy a new version of the application progressively, by increments, starting with a few users. If no errors are detected, the latest version can gradually roll out to the rest of the users. Once the new version is deployed to all users, you can slowly decommission the old version. This strategy minimizes the potential blast-radius of failure, limiting the impact on customers. It is, therefore, preferred.
The benefit of canary deployment is, of course, the near-immediate rollback it gives you — but more importantly, you get fast and safer deployments with real production test data.
The main challenge with canary deployment is routing traffic to multiple versions of the application. Consider several routing or partitioning mechanisms:
By keeping canary traffic selection random, most users aren’t adversely affected at any time by potential bugs in the new version, and no single user is adversely affected all the time.
Route 53 lets you use a weighted routing policy to split the traffic between the old and the new version of the software you are deploying. Weighted routing enables you to associate multiple resources with a single domain name or subdomain name and choose how much traffic is routed to each resource. It is particularly useful for canary deployments.
To configure weighted routing for canary deployment, you assign each record a relative weight that corresponds with how much traffic you want to send to each resource. Route 53 sends traffic to a resource based on the weight that you assign to the record as a proportion of the total weight for all records.
For example, if you want to send a tiny portion of the traffic to one resource and the rest to another resource, you might specify weights of 1 and 255. The resource with a weight of 1 gets 1/256th of the traffic (1/(1+255)), and the other resource gets 255/256ths (255/(1+255)).
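To make the arithmetic concrete, here is a small sketch of the weight-to-traffic calculation, together with the shape of a weighted record change. The domain name, IP, and weights are hypothetical examples, and the dict only mirrors the shape of a Route 53 ChangeBatch entry rather than calling the API:

```python
# Sketch: Route 53 weighted records for a canary split.
# The domain, IP, and weights below are hypothetical examples.

def traffic_share(weight, all_weights):
    """Fraction of traffic a record receives: its weight over the total."""
    return weight / sum(all_weights)

# A 1/255 split sends roughly 0.4% of traffic to the canary.
canary_weight, stable_weight = 1, 255
canary_share = traffic_share(canary_weight, [canary_weight, stable_weight])
print(f"canary receives {canary_share:.2%} of traffic")  # -> 0.39%

# Shape of the change you would submit (e.g., via the AWS CLI or an SDK);
# note the short TTL so traffic shifts take effect quickly.
canary_record = {
    "Action": "UPSERT",
    "ResourceRecordSet": {
        "Name": "app.example.com",
        "Type": "A",
        "SetIdentifier": "canary",   # distinguishes the weighted records
        "Weight": canary_weight,
        "TTL": 60,                   # keep TTL short for fast shifts
        "ResourceRecords": [{"Value": "192.0.2.10"}],
    },
}
```

To promote the canary, you would simply resubmit the records with updated weights until the old version's weight reaches 0.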
This is my favorite implementation of canary deployment as the traffic split starts from the very top of the architecture — DNS. To me, it is easier to understand and to implement.
If there’s one inconvenience to using DNS, it’s propagation time. From the Route 53 FAQs:
Amazon Route 53 is designed to propagate updates you make to your DNS records to its worldwide network of authoritative DNS servers within 60 seconds under normal conditions. A global change is successful when the API call returns an INSYNC status listing.
Note that caching DNS resolvers are outside the control of the Amazon Route 53 service and will cache your resource record sets according to their time to live (TTL). The INSYNC or PENDING status of a change refers only to the state of Route 53’s authoritative DNS servers.
So, watch out for the default TTL values, and shorten them.
For this method, you need two (or more) Auto Scaling groups (ASGs) behind a load balancer. You can then update one of the ASGs with the new golden image, incorporating the latest version of the software, and replace the current instances with new ones using rolling updates. Rolling updates launch the new instances first; once they are registered as healthy with the load balancer, the update drains connections from the old instances and terminates them progressively. It is, therefore, a safe deployment method, with no downtime.
For example, if your application needs a minimum of seven instances to run smoothly, you can use one ASG with six instances (two per AZ), and a second ASG with only one instance. You can then update the instance in the second ASG with the latest golden image. It becomes your canary, with 1/7th of the traffic.
You can then slowly increase the number of instances in that second group and progressively reduce the number of instances in the first ASG. In that case, your increment is roughly 14%. Naturally, the larger the number of instances in the first ASGs, the smaller the increment. Rolling back is straightforward; simply increase the maximum number of instances in the first ASG and remove the second ASG, or update it with the old image.
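The increments described above can be sketched as simple arithmetic. The instance counts are the ones from the example, and the function assumes the load balancer spreads requests evenly across all healthy instances in both groups:

```python
# Sketch: canary traffic share behind a load balancer that spreads
# requests evenly across healthy instances in both ASGs.

def canary_share(stable_instances, canary_instances):
    """Fraction of traffic hitting the canary ASG."""
    total = stable_instances + canary_instances
    return canary_instances / total

# Start: 6 stable + 1 canary instance, so the canary takes 1/7th of traffic.
print(f"{canary_share(6, 1):.0%}")  # -> 14%

# Shifting one instance at a time moves traffic in ~14% increments.
for stable, canary in [(6, 1), (5, 2), (4, 3), (3, 4), (2, 5), (1, 6), (0, 7)]:
    print(f"{stable} stable / {canary} canary -> "
          f"{canary_share(stable, canary):.0%} canary traffic")
```

With larger groups the same arithmetic gives finer increments, which is the point made above about the size of the first ASG.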
This method is easy to automate since CloudFormation supports the UpdatePolicy attribute for ASGs with AutoScalingRollingUpdate policy.
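As a rough sketch, the relevant fragment of such a template could look like the following (expressed here as a Python dict for readability; the batch size, instance count, and pause time are illustrative values, not recommendations):

```python
# Sketch: the CloudFormation UpdatePolicy attribute that drives rolling
# updates on an Auto Scaling group. All values here are illustrative.
update_policy = {
    "UpdatePolicy": {
        "AutoScalingRollingUpdate": {
            "MaxBatchSize": 1,             # replace one instance at a time
            "MinInstancesInService": 6,    # keep capacity up during the roll
            "PauseTime": "PT5M",           # wait between batches (ISO 8601)
            "WaitOnResourceSignals": True, # proceed only once instances signal healthy
        }
    }
}
```

Attaching this policy to the ASG resource means CloudFormation performs the rolling replacement for you whenever the launch configuration changes.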
Both the strategies mentioned above can cause delays due to DNS TTL caching or ASG updates and introduce additional costs. Fortunately, AWS recently announced weighted target groups for application load balancers, which lets developers control the traffic distribution between multiple versions of their application.
When creating an Application Load Balancer (ALB), you create one or more listeners and configure listener rules to direct the traffic to one target group. A target group tells a load balancer where to direct traffic to, e.g., EC2 instances, Lambda functions, etc.
To do canary deployment with the ALB, you can use forward actions to route requests to one or more target groups. If you specify multiple target groups for forward action, you must specify a weight for each target group.
Each target group’s weight is a value from 0 to 999. Requests that match a listener rule with weighted target groups are distributed to these target groups based on their weights. For example, if you specify two target groups, one with a weight of 10 and the other with a weight of 100, the target group with a weight of 100 receives ten times more requests than the other target group.
If you require session stickiness, you can enable target group stickiness for a particular rule. When the ALB receives the first request, it generates a cookie named AWSALBTG that encodes information about the selected target group, encrypts the cookie, and includes it in the response to the client. To maintain stickiness, the client needs to include that cookie in subsequent requests to the ALB.
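Putting the two previous paragraphs together, a weighted forward action has roughly this shape. The target group ARNs are placeholders, and the stickiness block is optional:

```python
# Sketch: an ALB listener forward action splitting traffic between a
# stable and a canary target group. The ARNs are placeholders.
forward_action = {
    "Type": "forward",
    "ForwardConfig": {
        "TargetGroups": [
            {"TargetGroupArn": "arn:aws:elasticloadbalancing:...:targetgroup/stable",
             "Weight": 100},
            {"TargetGroupArn": "arn:aws:elasticloadbalancing:...:targetgroup/canary",
             "Weight": 10},
        ],
        # Optional: pin a client to the target group chosen on its first
        # request (via the AWSALBTG cookie) for the given duration.
        "TargetGroupStickinessConfig": {"Enabled": True, "DurationSeconds": 3600},
    },
}

weights = [tg["Weight"] for tg in forward_action["ForwardConfig"]["TargetGroups"]]
print(f"canary share: {weights[1] / sum(weights):.1%}")  # -> 9.1%
```

Increasing the canary weight step by step, then swapping the two, gives you the same gradual rollout as the DNS and ASG approaches, but with no DNS propagation delay.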
Similar to standard applications, with serverless you need to take a conservative and pessimistic approach to deployment. So, instead of completely replacing the APIs or Lambda functions with a new version, you need to make the new version coexist with the old stable one and validate its performance and robustness gradually during the deployment. You need to split the traffic between two different versions of your APIs or functions.
(Note: decommission obsolete or insecure Lambda functions as new vulnerabilities are discovered.)
For your serverless applications, you have the option of using Amazon API Gateway since it supports canary release deployments.
Using canaries, you can set the percentage of API requests that are handled by new API deployments to a stage. When canary settings are enabled for a stage, API Gateway will generate a new CloudWatch Logs group and CloudWatch metrics for the requests handled by the canary deployment API. You can use these metrics to monitor the performance and errors of the new API and react to them. You can then gradually increase the percentage of requests handled by the new API deployment, or rollback if errors are detected.
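As a sketch, the canary settings attached to a deployment take roughly this shape; the percentage and the stage variable name are illustrative assumptions:

```python
# Sketch: canary settings for an API Gateway deployment, in the shape you
# would pass alongside a new deployment. The values are illustrative.
canary_settings = {
    "percentTraffic": 10.0,   # share of requests routed to the canary
    "useStageCache": False,   # don't serve canary responses from the stage cache
    # Stage variables can be overridden for the canary only, e.g. to point
    # at a different backend (the variable name here is hypothetical):
    "stageVariableOverrides": {"lambdaAlias": "canary"},
}
```

Once the canary metrics look healthy, you promote the canary deployment to the stage, which sets percentTraffic back to zero and makes the new API the stable one.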
With alias traffic shifting you can implement canary deployments of Lambda functions. Simply update the version weights on a particular alias, and the traffic will be routed to new function versions based on the specified weight. You can easily monitor the health of that new version using CloudWatch metrics for that alias and rollback if errors are detected.
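As a sketch, such an alias update takes roughly this shape; the function name, alias, and version numbers are illustrative. Note that the additional version weights are fractions, not percentages:

```python
# Sketch: shifting 10% of an alias's traffic to a new Lambda version, in
# the shape of an alias update. Names and versions are illustrative.
alias_update = {
    "FunctionName": "my-function",
    "Name": "live",              # the alias that clients invoke
    "FunctionVersion": "1",      # stable version receives the remaining traffic
    "RoutingConfig": {
        # Version "2" receives 10% of invocations; version "1" gets 90%.
        "AdditionalVersionWeights": {"2": 0.10}
    },
}

stable_share = 1.0 - sum(alias_update["RoutingConfig"]["AdditionalVersionWeights"].values())
print(f"stable version keeps {stable_share:.0%} of invocations")  # -> 90%
```

Promoting the canary then means setting FunctionVersion to "2" and clearing the routing config; rolling back means clearing the routing config while leaving FunctionVersion at "1".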
Changing aliases’ weights and checking the behavior of the newly deployed functions should, of course, be automated. Fortunately, AWS CodeDeploy can help as it can automatically update function alias weights based on a predefined set of preferences and automatically rollback if needed.
Out of the box, both give you:
Here is a shortlist to help you make sure you are ready with immutable infrastructure.
☑ Infrastructure as code
☑ Outstanding monitoring for applications and deployments
☑ Centralized logging
☑ Deployment automation using immutable deployment pattern (canary)
☑ Golden image creation and update pipeline
☑ Processes to assess, decommission, and distribute golden images
☑ Stateless applications layer
☑ Beware of temporary, transient state
☑ Using snapshots for volumes
☑ Using databases, object or in-memory data-stores for data
☑ SSH turned off
☑ No SSH keys on servers
If you like the idea of immutability, reducing vulnerability surface area, increasing the speed of deployment, and minimizing configuration drift, you’ll probably love unikernels.
The idea of using unikernels in the cloud isn’t new. It was first introduced by Anil Madhavapeddy in 2013: Unikernels: Library Operating Systems for the Cloud — (Madhavapeddy et al., 2013)
“Unikernels are specialized, single-address-space machine images constructed by using library operating systems.” (Unikernel.org)
Unikernels are restructured VMs, designed to have components that are more modular, flexible, secure, and reusable, built in the style of a library OS. They are specialized OS kernels written in a high-level language and act as individual software components. A full application may consist of multiple running unikernels working together as a distributed system.
The most exciting part is that unikernels compile application code together with the operating system, without the thousands of unnecessary drivers that operating systems usually include. The compiled image is minimalist.
The main benefits are:
(1) Fast startup time: Since unikernels are minimalist and lightweight, they boot extremely fast, within milliseconds.
(2) Improved security: Since unikernels only compile what they need, the number of drivers and the amount of configuration code deployed is reduced, which in turn minimizes the attack surface, thus improving security.
(3) Less prone to configuration drifts: Since unikernels are compiled directly into specialized machine images, you can’t SSH your way into it like in the good old days. Thus, you can’t easily hotfix them in production, preventing configuration drifts.
I think unikernels have a big part to play in the future of immutability in the cloud. So, keep an eye on the technology!
That’s all for now, folks. I hope you’ve enjoyed this post. Please don’t hesitate to share your feedback and opinions. Thanks a lot for reading :-)
Can you possibly be good at everything? In the digital era, the sheer quantity of tools, frameworks, programming languages, and methods/models overwhelms the brain. Technologies come and go, and learning a new technology often involves intense effort. Learning a new environment and domain knowledge takes time. When asking engineers how long it takes for a new hire to get up to speed, a common response is six to twelve months.
It takes years of hard work to become a master of anything, and DevOps skills are no different. Kubernetes, for example, is famous for its complexity and the relative ease with which junior developers can botch it up in unpredictable ways. If you are drinking from the firehose while learning new domain knowledge, you might also be deep into learning a new language like Clojure, utilizing new cloud APIs, applying Big-O notation to new algorithms, and mastering machine learning and serverless functions. And that just scratches the surface of a technology landscape that continues to accelerate and expand. How do you intend to keep up with the growing stack?
Mastering one skill by default means you will intentionally ignore other skills to focus on the one you want to master. The brain can only hold so much. Our human working memory is vulnerable to overload, which occurs as we study increasingly complex subjects and perform increasingly complex tasks.
Cognitive overload can result in lower quality, heroics, burnout and health problems.
One prime complaint from engineers is the consistent interruptions that prevent individuals from completing work during the day. If you can’t do your most important work during business hours, when do you get your work done? The pressure to work during the wee hours of the night on top of one’s regular day job is strong.
Working long hours may come with trendy bragging rights, implying strength and power, but as The Wall Street Journal health writer Melinda Beck says, “Genuine ‘short sleepers’ only need four hours of sleep per night, but they comprise only 1% to 3% of the total population.” So for the 97% to 99% of us who aren’t short sleepers, working the wee hours brings sleep deprivation and mistakes—both are contributors to burnout.
Burnout is more than feeling blue. It is a chronic state of being out of sync at work, and it can be a significant problem in your life, including the following impacts:
Risk of burnout from information overload is real and leads to serious health issues:
Too much information puts our brain’s health in danger, resulting in an information overload. Over time, exposure to multiple sources of data leads to the overstimulation of the brain. Neurons get overloaded with data, numbers, deadlines, targets to be met, and projects to be completed, and all this unnecessary information can ultimately destroy them. Consequently, a stressed and overloaded brain is at high risk of dementia and other neurodegenerative disorders (Parkinson’s and Alzheimer’s diseases).
Expecting individuals, including ourselves, to be skilled in the “full stack” adds to the ongoing cognitive load that must be carried to get the job done. This can lead to burnout and even, in some cases, more serious health issues. But we need to be able to support the entire stack to help our businesses stay relevant and keep moving forward. What can we do? In the next section, we will look at some options to help support the entire stack while providing a healthy and humane work environment for our teams and ourselves.
The full-stack developer or engineer is extremely difficult to find, develop, and sustain. In cases where individuals are able to fulfill these roles, they place themselves at risk of cognitive overload and the related health risks. Additionally, the acceleration and evolution of the technology stack means that mastery is a moving target, demanding continual learning investment in multiple domains. However, our businesses need full-stack functionality to stay relevant, efficient, and nimble in the market. How do we meet that demand without overloading individuals?
A full-stack team has the combined skills across its members to effectively design, build, deploy, and operate software throughout all development cycles of their deliverables. Moving to full-stack teams helps us deliver the full-stack advantage to our organizations without the challenges of recruiting, developing, and sustaining full-stack developers or engineers.
Here are some recommended practices to help build full-stack teams:
To download the full guidance paper Full Stack Teams, Not Engineers, please sign up here.
So you know I have started a book, and I have announced it on dev.to:
Check the book page here: https://leanpub.com/adventuresofadeveloperteam/
Now I have decided that I would post the very beginning of this story, just to ask you opinions, that you like the characters, or not, etc. Any remark is welcome. It is a nonlinear story, but there are no other possible storylines until the meeting. So... here you are:
There is a company called UtopiaLabs based in Zürich, Switzerland. It is a startup founded about three years ago by two entrepreneurs, and they currently employ about 50 people, so they are growing very fast!
They aim to build solutions to make the world a better place :) They have ongoing projects on making factories greener, building infrastructure to prevent polluting the oceans, researching new methods to use more recycled materials in new buildings, etc.
They also believe in the power of great design and communication. By the way, this is their logo:
One of their projects is a platform that they hope will solve the problem of food waste in Zürich.
The team is currently working on one of the main projects that will help solve the problem of food waste by connecting local companies and customers.
They collect and analyse data by quantifying food loss and waste, overproduction, out-of-date food, and meals left uneaten by customers. Using the data, they recommend actions, such as using more semi-prepared food, improving meal forecasting, training staff, and engaging consumers, along with how to achieve these.
Last month they released the MVP version, and since then a few users started using it. Users are pleased with the platform, but new requirements and a few minor bugs came up. They are planning to release the next version in 2 months. They are having a meeting today where they discuss the schedule, the deadlines, and so on.
There are six young professionals on the team: a lead developer, two senior developers, a junior developer, a manual tester and a manager. First, let's meet each member of the team!
So to tell the truth, everyone loves Chris. Team members love him because he helps them with anything they need help with. Managers love him because he makes excellent decisions and knows very well how to bring out the best in people. People love him because he is a nice person. :)
He has a Master's Degree in Computer Science and has been working in the industry for ten years now, and he has been with UtopiaLabs since its beginning. Previously he worked in the financial sector at various multinational enterprises, so working at a startup is a bit new for him, but he loves it.
Anna is a senior backend developer, and she has been coding for 5-6 years. She learned the basics of programming in Java, and since then she became the master of a few other languages, too, but she loves JVM languages the most. She has some phobias of frontend development, because as she says "I don't have the patience to spend many hours pushing some pixels around to get the right ratios, spacing, contrasts and stuff..."
Anyway, she is a real hipster: she prefers to (always) bicycle or walk to work, drinks a lot of coffee, usually speciality coffee, she is a vegan and buys clothes from thrift shops!
Luke is also a senior developer, but he is the one usually "pushing pixels around". He is a very skilled frontend developer and an aspiring UX/UI designer. He spent weeks finding and implementing the best design for the platform because it is a crucial aspect. We all know that "great design can make your audience believe and invest in your business".
He is a bizarre guy, having a strict daily routine, working out at least five times a week, counting calories and eating strange meals, like porridge with two raw eggs for breakfast. Yuk! :P
Let me introduce the junior dev of the team: Adam. He is the youngest of all as he is a university student pursuing his Bachelor's Degree in Computer Science. Adam is smart and curious, enjoys learning new languages and technologies.
He is usually in a hurry because sometimes he has classes in the morning and also in the afternoon so he can spend time at work only between them. He started a few weeks ago as an intern, and he is usually fixing bugs while wearing the same outfit: jeans and T-shirts...
Everything the developers do will go to Amber. She is a manual tester, so her job is to find all the bugs. She wants to broaden her knowledge, so she wants to start an automation testing course soon.
Amber has one cat and three plants. She loves hiking and dancing :) She started working at UtopiaLabs a year ago. At the same time she is pursuing her Master's Degree in Mathematics, so she is precise and usually finds all the bugs... :) No bug can get past Amber!
Kelly, the manager, comes from a very different background. She has a Bachelor's Degree in Business Management, and she has been working as a project manager for a few years now.
She loves rock music and goes to concerts whenever she can, mainly in the summer. Her favourite bands are AC/DC, Led Zeppelin and The Beatles. When she finished university, she started learning how to play the guitar. It's something she wanted to do since she was a small child.
So now that you know the team you should get to today's meeting I mentioned earlier. Hurry, it starts soon! :)
Oh, you got here in time, as well as Amber, Kelly and Luke, but it seems that others are gonna be a few minutes late... as usual.
"So we have a short agenda for today: clarify the requirements and talk about the deadlines." - starts Kelly a few minutes later.
After getting to know the details of the requirements and estimating how many days each would take to develop and test, they realised that they would be ready three weeks earlier than the final deadline of the next release. It means that they need to decide what to do in those three weeks.
Chris takes advantage of the opportunity and recommends spending the time learning about and trying out new technologies and methodologies, doing some experiments, etc. He thinks this knowledge would come in handy when the platform receives more traffic and the application has to handle a lot of users and data.
However, Kelly recommends instead attending courses paid for by the company, so that every team member can learn whatever he or she wants. Anna would love to learn about Clojure, Adam would be happy with any course, but Amber really wants to start that automation testing course.
So here comes the very first decision: what will the team do?
-> go with Chris's idea and experiment with new technologies
-> go with Kelly's plan and attend courses
Errata: An earlier version of this post was misrepresenting conditions as exceptions, which has been addressed.
I have been reading Practical Common Lisp by Peter Seibel over the weekend, which is an excellent introduction to Common Lisp, showcasing its power by writing real programs. If you are interested in Lisp or programming languages at all, I recommend at least skimming it, it is free to read online.
Writing a Lisp-descended language professionally, and also living inside Emacs, I had dabbled in Common Lisp before, but I still found something I was not aware of, restarts. I do not think that this is a particularly well known feature outside the Lisp world, so I would like to spread awareness, as I think it is a particularly interesting take on error handling.
The book explains restarts using a mocked parser, which I will slightly modify for my example. Imagine you are writing an interpreter/compiler for a language. On the lowest level you are parsing lines to some internal representation:
(define-condition invalid-line-error (error)
  ((line :initarg :line :reader line)))

(defun parse-line (line)
  (if (valid-line-p line)
      (to-ir line)
      (error 'invalid-line-error :line line)))
We define a condition, which is similar to an exception object with metadata in other languages. (A “condition” in Common Lisp, as has been explained to me by Michał “phoe” Herda, is a way of signalling arbitrary events up the stack to allow running of additional code, not just signalling errors. They are comparable to hooks in Emacs, but dynamically scoped to the current call stack.) We also define a function which attempts to parse a single line. (This assumes, of course, that a line always represents a complete parsable entity, but this is only an example after all.)
If it turns out that the line is invalid, the function signals a condition up the stack. We attach the line encountered, in case we want to use it for error reporting.
Now imagine your parser is used in two situations: there is a compiler, and a REPL. For the compiler, you would like to abort at the first invalid line you encounter, which is what we are currently set up to do. But for the REPL, you would like to ignore the invalid line and just continue with the next one. (I'm not saying that is necessarily a good idea, but it is something some REPLs do, for example some Clojure REPLs.)
To ignore a line without restarts, we would have to do it at a low level: return nil instead of signalling, and filter out the nil values up the stack. Simply handling the condition will not help us much, because at that point we have already lost our position in the file. Or have we?
The next layer up is parsing a collection of lines:
(defun parse-lines (lines)
  (loop for line in lines
        for entry = (restart-case (parse-line line)
                      (skip-line () nil))
        when entry collect it))
This is where the magic begins. The loop construct just loops over the lines, applies parse-line to every element of the list, and returns a list containing all results which are not nil. The feature I am showcasing in this post is restart-case. Think of it this way: it does not handle a condition, but when the stack starts unwinding (technically not unwinding yet, at least not in Common Lisp) because we signalled a condition in parse-line, it registers a possible restart position. If the condition is handled at some point (if it isn't, you will get dropped into the debugger, which also gives you the option to restart), the signal handler can choose to restart at any restart point that has been registered down the stack.
Now let us have a look at the callers:
(defun parse-compile (lines)
  (handler-case (parse-lines lines)
    (invalid-line-error (e)
      (print-error e))))

(defun parse-repl (lines)
  (handler-bind ((invalid-line-error
                   #'(lambda (e)
                       (invoke-restart 'skip-line))))
    (parse-lines lines)))
There is a lot to unpack here. The compiler code is using handler-case, which is comparable to catch in other languages. It unwinds the stack to the current point and runs the signal handling code, in this case print-error.
The REPL is different. Because we do not actually want to unwind the stack all the way, but resume execution inside parse-lines, we use a different construct, handler-bind, which handles invalid-line-error by invoking the skip-line restart. If you scroll up to parse-lines now, you will see that the restart clause says: if we restart here, just return nil. That nil will then be filtered out on the very next line by the when entry clause of the loop.
The elegance here is the split between the signal handling code and the decision about which signal handling approach to take. You can register a lot of different restart-case statements throughout the stack, and let the caller decide if some signals are okay to ignore, without the caller having to have intricate knowledge of the lower-level code. (It does need to know about the registered restart-case statements though, at least by name.)
If you want to learn more about this, make sure to have a look at the book, it goes into much more detail than I did here.
(import '(java.util UUID))

(defn random-uuid []
  (UUID/randomUUID))
random-uuid is just a Clojure equivalent of what is available in ClojureScript under the same name.
(random-uuid) => #uuid"a3d3dafe-707f-4f52-abbd-be1d5f2dcb77"