Adam Raffe - Cloud, Data Centre & Networking

Check Out my Azure 'Virtual Data Centre' Lab!

I’ve just finished working on a new self-guided lab that focuses on the Azure ‘Virtual Data Centre’ (VDC) architecture. The basic idea behind the VDC is that it brings together a number of Azure technologies, such as hub and spoke networking, User-Defined Routes, Network Security Groups and Role Based Access Control, in order to support enterprise workloads and large scale applications in the public cloud.

The lab uses a set of Azure Resource Manager (ARM) templates to deploy the basic topology, which you’ll then need to configure further to build out the connectivity, security and monitoring. Once the basic template build has been completed, you’ll be guided through setting up the following:

  • Configuration of site-to-site VPN

  • Configuration of 3rd party Network Virtual Appliance (in this case a Cisco CSR1000V)

  • Configuration of User Defined Routes (UDRs) to steer traffic in the right direction

  • Configuration of Network Security Groups (NSGs) to lock down the environment

  • Azure Security Center for security monitoring

  • Network monitoring using Azure Network Watcher

  • Alerting and diagnostics using Azure Monitor

  • Configuration of users, groups and Role Based Access Control (RBAC)

I put this lab together to run as a workshop for the partners that I work with, but I am also making it available here for you to run if you are interested. It should take around 3-4 hours to complete and all you need is an Azure subscription (you can sign up for a free trial here).

The lab guide is available on Github here:

https://github.com/Araffe/vdc-networking-lab
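If you haven’t deployed ARM templates from the command line before, the overall pattern looks something like the sketch below - the resource group name, location and template file name are placeholders, and the lab guide itself walks you through the exact commands and parameters to use:

# Clone the lab repository
git clone https://github.com/Araffe/vdc-networking-lab.git
cd vdc-networking-lab

# Create a resource group and deploy one of the lab's templates into it
# (group name, location and template file are placeholders - follow the lab guide for the real values)
az group create --name VDC-Lab --location westeurope
az group deployment create \
  --resource-group VDC-Lab \
  --template-file azuredeploy.json    # placeholder - use the template file named in the lab guide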

Let me know what you think!

Adam

Say Hello to Azure Container Instances (ACI)!

It’s been some time since I last posted a blog - I’ve been spending the majority of my time settling into my new role at Microsoft and learning as much as I can about the Azure platform. Considering my background, it seems fitting that my first “Azure related” blog post is all about… ACI! In this case, however, ACI stands for Azure Container Instances - a new Azure offering announced this week and currently in preview. So what are Azure Container Instances?

Let’s first consider how we might run containers in the cloud today. Up until this point, if you wanted to run containers on Azure, you had two main options:

a) Provision your own virtual machines and run containers on top, or

b) Use Azure Container Service (ACS) - this service essentially allows you to quickly provision a cluster using Kubernetes, DC/OS or Docker Swarm as the orchestration engine.

Both of these solutions suffer (to varying degrees) from the overhead of having to manage the underlying VM infrastructure and everything that goes along with that - maintenance, security, patching and so on. Wouldn’t it be nice if we could simply create our containers in the cloud without having to worry about the infrastructure they are running on? Look no further than Azure Container Instances.

ACI instances give us the ability to run containers easily from the command line or using ARM templates, with no need to create the underlying VMs. In addition, these container instances are billed per second, meaning they should work out to be extremely cost-effective. The official MS page for ACI is here.

To see how ACI works, let’s try it out. To start with, I’m going to create a single container using the Azure CLI. I’ll use the following command:

az container create -g ACI --name acitest --image nginx --ip-address public

This command is essentially asking for a container to be created under the resource group named “ACI”, with a name of “acitest”, using the public nginx image (which will be pulled from Docker Hub). The command also asks for a public IP address to be assigned to the container. The results of this can be seen in the following figure:

ACI1

The JSON output that results shows a wealth of information about the container - for example, I can see the amount of CPU and memory (1 CPU core, 1.5 GB of RAM - this is configurable), the IP address that has been assigned, as well as some information about the resource group. You can also see that the provisioning state is “creating”. If I run the command az container list a few seconds later, I can see the provisioning state is now “succeeded”:

ACI2

At this point, I can browse to the public IP address that Azure assigned to my container and I receive the default nginx page.

I can also get the log output from the container using the command az container logs:

ACI4
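For completeness, the commands behind those last couple of screenshots (plus az container show, which is handy for inspecting a single instance) look like this - flag spellings are as per recent Azure CLI releases and may differ slightly from the preview-era CLI:

# Check the state of all container groups in the resource group
az container list -g ACI -o table

# Show the full JSON for the "acitest" container group
az container show -g ACI -n acitest

# Retrieve the container's log output
az container logs -g ACI -n acitest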

OK, so this is all very nice - but wouldn’t it be great if we could automate this process? And it would be even better if we could create multiple container instances at the same time. It is of course possible to do this using ARM templates. Check out the example template below (full template available here):

ACI5

The JSON shown above should do the following:

  • Create a new container group named “aciMultiGroup”.

  • Create two container instances - “aci-container1” and “aci-container2” - as part of the group.

  • Expose port 80 from container 1 and assign a public IP address to the group.

This template introduces us to the container group concept. The idea behind this is that one or more containers with similar requirements or lifecycles can be deployed and managed together inside a container group (somewhat similar to the ‘pod’ concept in Kubernetes).
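The full template is linked above, but in case the screenshot is hard to read, here’s a rough approximation of the interesting resource. Note that this is my own sketch rather than the exact template from the repo: the file name, image choices and apiVersion (a later GA version rather than the original preview one) are assumptions, with the group and container names taken from the list above:

# Write a minimal container group template to disk ("aci-group.json" is just an illustrative name)
cat > aci-group.json <<'EOF'
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "resources": [
    {
      "type": "Microsoft.ContainerInstance/containerGroups",
      "apiVersion": "2018-10-01",
      "name": "aciMultiGroup",
      "location": "[resourceGroup().location]",
      "properties": {
        "osType": "Linux",
        "containers": [
          {
            "name": "aci-container1",
            "properties": {
              "image": "nginx",
              "ports": [ { "port": 80 } ],
              "resources": { "requests": { "cpu": 1, "memoryInGB": 1.5 } }
            }
          },
          {
            "name": "aci-container2",
            "properties": {
              "image": "nginx",
              "resources": { "requests": { "cpu": 1, "memoryInGB": 1.5 } }
            }
          }
        ],
        "ipAddress": {
          "type": "Public",
          "ports": [ { "protocol": "TCP", "port": 80 } ]
        }
      }
    }
  ]
}
EOF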

So let’s deploy this template and see what happens:

ACI6

OK, everything looks good. The output of the script has given me the public IP address of the container group - let’s try and browse to that:

ACI7

Success! Hopefully this post has given you a taster of what Azure Container Instances can do and the potential they have. Thanks for reading!

Moving on

A little over three years ago, I was introduced to Cisco’s Application Centric Infrastructure (ACI) for the first time. For someone who had spent many years at Cisco working on “traditional” networking platforms, this was something of a revelation - the ability to define network connectivity in a programmable manner using a policy based model was a major departure from anything I had done before. Since then, I’ve been lucky enough to work with a variety of customers around the globe, helping them to design and deploy the ACI solution. I’ve been part of a great team of people, worked closely with the INSBU team (responsible for ACI) and presented to hundreds of people at Cisco Live.

Over the last few months, I’ve spent some time thinking about what I do next: as ACI becomes more mainstream, do I continue with more of the same - expanding my skill set to include the other great products (Tetration, CloudCenter, etc.) that Cisco has in the data centre - or do I take a slightly different path? After some serious consideration, I’ve decided to go with the latter option - later this month, I’ll be joining Microsoft as a Cloud Solutions Architect, working with the Azure platform.

I’ve thoroughly enjoyed writing this blog over the last few years and want to thank everyone who has read the posts, commented or given me feedback. I’m hoping to continue blogging occasionally, so keep an eye out for the odd Azure-related post!

Adam

Learning ACI - Part 12: Inter-VRF and Inter-Tenant Communication

ACI has the ability to divide the fabric up into multiple tenants, or multiple VRFs within a tenant. If communication is required between tenants or between VRFs, one common approach is to route traffic via an external device (e.g. a firewall or router). However, ACI is also able to provide inter-tenant or inter-VRF connectivity directly, without traffic ever needing to leave the fabric. For inter-VRF or inter-tenant connectivity to happen, two fundamental requirements must be satisfied:

  1. Routes must be leaked between the two VRFs or tenants that need to communicate.

  2. Security rules must be in place to allow communication between the EPGs in question (as is always the case with ACI).

The question is, what is the correct way to configure this type of connectivity? I’ve seen quite a bit of confusion around this, particularly when it comes to deciding where to configure subnets (at the bridge domain level or EPG level), so hopefully I can provide a bit of clarity in this post. I’m going to cover three main scenarios:

  1. Inter-VRF communication, where there is a 1:1 mapping between bridge domains and EPGs within the VRFs.

  2. Inter-tenant communication, where there is a 1:1 mapping between bridge domains and EPGs.

  3. Inter-VRF communication, where multiple EPGs are associated with a single bridge domain in one or both of the VRFs.

Scenario 1: Inter-VRF Communication - 1:1 Mapping Between BD and EPG

In this example, we have a simple setup with two VRFs - each VRF has a single bridge domain (BD) and End Point Group (EPG) configured as shown in the following diagram:

scenario-1-1

At the beginning of this article, I mentioned that there were two requirements when configuring inter-VRF / inter-tenant communication: leaking of routes and security rules. ACI uses contracts to control both of these.

Let’s deal first with how we leak routes from the consumer side (EPG2 in our example) to the provider side (EPG1). To make this happen, we do the following:

  1. Configure our subnet at the BD level and ensure it is marked with Shared Between VRFs.

  2. Create a contract that defines the appropriate port protocols we wish to allow and make sure the scope is defined as Tenant.

  3. Attach the contract to the provider and consumer EPGs.

At this point, we have something like this configured:

scenario-1-2

As you can see, associating the contract with the provider and consumer EPGs within each VRF results in the consumer side BD subnet (172.16.1.0/24) being leaked to VRF A.

OK, so that’s great - we have routes leaked between VRFs in one direction (from consumer to provider). But clearly nothing is going to work until we leak routes in the opposite direction. So how do we get those routes leaked? This is the bit where a few people get tripped up.

The answer is that - on the provider side - we need to configure our subnets at the EPG level, rather than the bridge domain level (as is the case in ‘normal’ deployments that don’t involve route leaking). This is shown in the following diagram:

Scenario-1-3.png

There is a long technical explanation as to why this is necessary - I won’t bore you with that here, but fundamentally, the fabric has to be told which subnet belongs to which provider EPG, which is why the subnet must be configured at the EPG level rather than the BD level.

One point to note here is that you do not need to export contracts when configuring inter-VRF communication. This is only a requirement for inter-tenant communication.
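As an aside for anyone who prefers the API to the GUI, the whole of scenario 1 can also be pushed via the APIC REST API. The following is only a rough sketch with assumed names (Tenant1, VRF-B, BD1/BD2, App1, EPG1/EPG2, the permit-all ‘default’ filter) and an example provider-side subnet, but it highlights the three key points: the ‘shared’ scope on the subnets, the ‘tenant’ scope on the contract, and the provider-side subnet sitting under the EPG rather than the BD:

# Log in to the APIC and store the session cookie (APIC address and credentials are placeholders)
curl -sk -c cookie.txt https://apic.example.com/api/aaaLogin.json \
  -d '{"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}}'

# Scenario 1 in a single POST to the tenant: contract with scope "tenant", consumer-side
# subnet on the BD, provider-side subnet on the EPG, and the provide/consume relationships
curl -sk -b cookie.txt https://apic.example.com/api/mo/uni/tn-Tenant1.json -d '{
  "fvTenant": { "attributes": { "name": "Tenant1" }, "children": [

    { "vzBrCP": { "attributes": { "name": "inter-vrf", "scope": "tenant" }, "children": [
        { "vzSubj": { "attributes": { "name": "all" }, "children": [
            { "vzRsSubjFiltAtt": { "attributes": { "tnVzFilterName": "default" } } }
        ] } }
    ] } },

    { "fvBD": { "attributes": { "name": "BD2" }, "children": [
        { "fvRsCtx": { "attributes": { "tnFvCtxName": "VRF-B" } } },
        { "fvSubnet": { "attributes": { "ip": "172.16.1.1/24", "scope": "private,shared" } } }
    ] } },

    { "fvAp": { "attributes": { "name": "App1" }, "children": [
        { "fvAEPg": { "attributes": { "name": "EPG1" }, "children": [
            { "fvRsBd": { "attributes": { "tnFvBDName": "BD1" } } },
            { "fvSubnet": { "attributes": { "ip": "172.16.2.1/24", "scope": "private,shared" } } },
            { "fvRsProv": { "attributes": { "tnVzBrCPName": "inter-vrf" } } }
        ] } },
        { "fvAEPg": { "attributes": { "name": "EPG2" }, "children": [
            { "fvRsBd": { "attributes": { "tnFvBDName": "BD2" } } },
            { "fvRsCons": { "attributes": { "tnVzBrCPName": "inter-vrf" } } }
        ] } }
    ] } }

  ] }
}'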

Hopefully that’s clear so far - let’s move to our second scenario.

Scenario 2: Inter-Tenant Communication - 1:1 Mapping Between BD and EPG

In this scenario, we are going to configure communication between EPGs which sit in different tenants, as shown here:

scenario-2-1

In terms of configuration, this scenario is actually very similar to scenario 1 - we still need to configure the following:

  • Consumer subnet at the bridge domain level (marked as Shared Between VRFs).

  • Provider subnet at the EPG level (marked as Shared Between VRFs).

The major difference in this scenario is that we must now configure the scope of the contract as Global, and we must also export the contract from the provider side tenant to the consumer side tenant. On the provider side, export the contract under the Security Policies section of the tenant config. On the consumer side, we will consume the exported contract as a Contract Interface under the EPG. The final configuration looks something like this:

scenario-2-2
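Again as a rough REST sketch (the tenant, application profile, EPG and exported contract names below are my own assumptions, and the session cookie is assumed to have been obtained as in the earlier login example), the pieces that differ from scenario 1 look something like this:

# Provider tenant: the contract now needs scope "global"
curl -sk -b cookie.txt https://apic.example.com/api/mo/uni/tn-TenantA.json -d '{
  "fvTenant": { "attributes": { "name": "TenantA" }, "children": [
    { "vzBrCP": { "attributes": { "name": "inter-tenant", "scope": "global" }, "children": [
        { "vzSubj": { "attributes": { "name": "all" }, "children": [
            { "vzRsSubjFiltAtt": { "attributes": { "tnVzFilterName": "default" } } }
        ] } }
    ] } }
  ] }
}'

# Consumer tenant: once the contract has been exported (Security Policies section of the
# provider tenant), the consumer EPG consumes it as a contract interface - note fvRsConsIf
# rather than the usual fvRsCons, referencing the name given to the exported contract
curl -sk -b cookie.txt \
  https://apic.example.com/api/mo/uni/tn-TenantB/ap-App1/epg-EPG2.json -d '{
  "fvAEPg": { "attributes": { "name": "EPG2" }, "children": [
    { "fvRsConsIf": { "attributes": { "tnVzCPIfName": "inter-tenant" } } }
  ] }
}'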

OK, that was easy enough - onto our last scenario where things get a little more complex.

Scenario 3: Inter-VRF Communication - Multiple EPGs Associated With a Single BD

For this last scenario, we are again looking at inter-VRF communication, but this time we have more than one EPG associated with a single bridge domain in both VRFs, as shown here:

scenario-3-1

If we follow our previous examples, we would configure our consumer side subnets at the BD level and our provider side subnets at the EPG level. Hold on a minute though, there’s an issue with that - if we assume that we have only a single subnet on the provider side and that we are using that subnet for all EPGs, does that mean we have to configure the same subnet on all provider EPGs? Well actually, we can’t do that - subnets configured on an EPG must not overlap with other subnets in that same VRF. So how do we get around this?

The answer to this conundrum is that we need to return to a configuration where the subnet is configured at the BD level on both provider and consumer sides. However, one of the consequences of this is that we then need to create a bidirectional contract relationship between the EPGs - in other words, we must provide and consume the contract from both sides in order to allow route leaking to happen. This ends up looking like this:

scenario-3-2
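Expressed via the REST API, this is simply a provided and a consumed relation to the same contract on each EPG - a quick sketch with illustrative names, reusing an APIC session cookie obtained as in the earlier examples:

# EPG in VRF A: provide and consume the same contract
curl -sk -b cookie.txt https://apic.example.com/api/mo/uni/tn-Tenant1/ap-App1/epg-EPG1.json -d '{
  "fvAEPg": { "attributes": { "name": "EPG1" }, "children": [
    { "fvRsProv": { "attributes": { "tnVzBrCPName": "inter-vrf" } } },
    { "fvRsCons": { "attributes": { "tnVzBrCPName": "inter-vrf" } } }
  ] }
}'

# EPG in VRF B: the same bidirectional relationship
curl -sk -b cookie.txt https://apic.example.com/api/mo/uni/tn-Tenant1/ap-App1/epg-EPG3.json -d '{
  "fvAEPg": { "attributes": { "name": "EPG3" }, "children": [
    { "fvRsProv": { "attributes": { "tnVzBrCPName": "inter-vrf" } } },
    { "fvRsCons": { "attributes": { "tnVzBrCPName": "inter-vrf" } } }
  ] }
}'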

Now, if you are familiar with ACI and the way in which rules are applied in hardware, you may spot a downside to this approach. The problem with consuming and providing a contract in both directions is that this means double the number of security rules are programmed in hardware (TCAM). If you are a heavy user of contracts in your environments, doubling up on the number of rules may be a concern for you.

If you are concerned about this doubling of rules, there is a potential solution. In order to reduce the number of rules that we need to program, we can configure the contracts that we use to be unidirectional in nature. What I mean by this is that we can un-check the boxes entitled Apply Both Directions and Reverse Filter Ports when we create the contract. Now, having those boxes checked is generally important for allowing return traffic back to the initiating host (on the consumer side). So if we un-check them, how do we allow return traffic?

The answer is that we configure two contracts - one that allows traffic to be initiated (let’s say it allows any port -> port 80) and one allowing the reverse traffic (let’s say port 80 -> any port). Those contracts are then applied separately to the consumer and provider side EPGs, as shown in the following two diagrams:

scenario-3-3

scenario-3-4

By doing this, we cut the number of entries required in hardware by half compared to the solution shown in the first part of this scenario.

Hopefully this helps you out a bit if you are configuring this feature - thanks for reading.

Learning ACI - Part 11: Transit Routing

The 1.1(1j) & 11.1(1j) release of ACI introduced support for transit routing. Prior to this, the ACI fabric acted as a ‘stub’ routing domain; that is, it was not previously possible to advertise routing information from one routing domain to another through the fabric. I covered L3 Outsides in part 9 of this series where I discussed how to configure a connection to a single routed domain. In this post, we’ll look at a scenario where the fabric is configured with two L3 Outsides and how to advertise routes from one to another. Here is the setup I’m using:

Transit-Routing

In my lab, I have a single 4900M switch which I have configured with two VRFs (Red and Green) to simulate the two routing domains. In the Red VRF, I have one loopback interface - Lo60 (10.1.60.60) which is being advertised into OSPF. In the Green VRF, I have Lo70 (10.1.70.70) and Lo90 (10.1.90.90) which are also being advertised into a separate OSPF process.

On the ACI fabric side, I have two L3 Outsides which correspond to the two VRFs. These two L3 Outsides are associated with a single private network (VRF) on the ACI fabric. OSPF is configured on both L3 Outsides with regular areas.

At this point, my OSPF adjacencies are formed and my ACI fabric is receiving routing information from both VRFs on the 4900M, as can be seen in the following output (taken from the ‘Fabric’, ‘Inventory’ tab, then under the specific leaf node in the ‘Protocols’, ‘OSPF’ section):

Screen Shot 2015-10-17 at 17.38.33

You can see from this output that the fabric is receiving all of the routes for the loopback addresses configured under the Red and Green VRFs on the 4900M. Let’s now take a look at the routing table for the Red VRF on the 4900M:

Screen Shot 2015-10-17 at 17.42.40

It’s clear from the above output that the Red VRF is not receiving information about either the 10.1.70.0 or 10.1.90.0 prefixes from the Green VRF - in other words, the ACI fabric is not currently re-advertising routes that it has received from one L3 Outside to the other.

Let’s say I now want to advertise the prefixes from the Green VRF (10.1.70.0/24 and 10.1.90.0/24) into the Red VRF - how do I enable that? The main point of configuration for transit routing is found under ‘External Networks’ in the L3 / Routed Outside configuration. The key here is the Subnets configuration - in previous versions of ACI, this was used only to define the external EPG for policy control between inside and outside networks. Now, however, the subnets configuration also controls transit routing in addition to policy control. Here are the options and what they are used for:

  • Security Import Subnet: Any subnet defined as a security import subnet is used for policy control. Effectively, a subnet defined in this way forms the external EPG - this is the same functionality that existed in previous ACI releases. If I define a subnet as a security import subnet, this subnet will be accessible from internal EPGs (or other external EPGs), as long as a suitable contract is in place. Importantly, this option has nothing whatsoever to do with the control of routing information into or out of the fabric.

  • Export Route Control Subnet: This option is used to control which specific transit routes are advertised out of the fabric. If I mark a subnet with export route control capability, I am telling the ACI fabric that I want those routes to be advertised to the external device. Note that this option controls the export of transit routes only - it does not control the export of internal routes configured on a bridge domain (you can see evidence of this in the 4900M routing table output above, where one of my BD routes - 192.168.1.0/24 - already appears).

  • Import Route Control Subnet: Similar to the export route control option, this option can be used to control which routes are allowed to be advertised into the fabric. Note that import route control is currently only available if BGP is used as the routing protocol.

How are these used in practice? Let’s start with a simple example. I’m going to advertise just one of my ‘Green’ subnets (let’s say 10.1.70.0/24) towards the ‘Red’ VRF. To do this, I add the 10.1.70.0 subnet to the L3 Out facing the Red VRF and mark it with the ‘export route control’ option:

Screen Shot 2015-10-17 at 19.33.01

Now if I check the ‘Red’ routing table, I see the 10.1.70.0 prefix advertised from the fabric:

Screen Shot 2015-10-17 at 19.34.37

If I add the 10.1.90.0/24 prefix to my subnets list, that transit route will also be advertised from the fabric to the Red VRF.

You might now be wondering how you would handle a large list of transit routes; would they all need to be individually entered into the subnets list? No - you can use the Aggregate Export option. This option is currently only available when “0.0.0.0/0” is used as the subnet; essentially, this option tells the fabric to advertise all transit routes. Checking the ‘aggregate’ option is important here - if you simply enter “0.0.0.0/0” as an export route control subnet without the aggregate option, the fabric will try to advertise only the 0.0.0.0/0 route itself. In my example, I’ve now removed the individual subnet and entered 0.0.0.0/0 with the aggregate option:

Screen Shot 2015-10-17 at 19.39.56

I now see both my subnets advertised to the Red VRF:

Screen Shot 2015-10-17 at 19.41.12

So that’s the routing taken care of - but there’s an additional step if you want traffic to flow. Remember, the ACI fabric models external destinations as EPGs. If you have two external destinations that need to communicate through the fabric, you must have both of those external destinations covered by a Security Import Subnet. As an example, if I wanted to allow hosts on 10.1.60.60 (part of the Red VRF) to talk to hosts on 10.1.70.70 (Green VRF), in addition to exporting the routes themselves (in both directions), I would need to define both of those subnets with the security import option:

4900M-External Subnets Configuration:

Screen Shot 2015-10-17 at 19.45.38

4900M-External-2 Subnets Configuration:

Screen Shot 2015-10-17 at 19.47.57

I would then need to provide / consume contracts between these networks for traffic to flow. In the above example, I could have just created a single 0.0.0.0/0 subnet on each side and marked it with export route control, aggregate export and security import options - that would effectively allow all routes and all destinations to communicate with each other (assuming the appropriate contracts are in place).
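To round things off, here’s a rough sketch of how that final ‘allow everything’ configuration might look via the APIC REST API for one side. The tenant and external EPG names are assumptions, the L3 Out name is taken from the lab above, and the same subnet would be added under the second L3 Out as well:

# Log in to the APIC and store the session cookie (address and credentials are placeholders)
curl -sk -c cookie.txt https://apic.example.com/api/aaaLogin.json \
  -d '{"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}}'

# External EPG under the L3 Out facing the Red VRF: 0.0.0.0/0 marked for export route
# control, aggregate export and security import
curl -sk -b cookie.txt \
  https://apic.example.com/api/mo/uni/tn-Tenant1/out-4900M-External/instP-ExtEPG.json -d '{
  "l3extInstP": { "attributes": { "name": "ExtEPG" }, "children": [
    { "l3extSubnet": { "attributes": { "ip": "0.0.0.0/0",
        "scope": "export-rtctrl,import-security",
        "aggregate": "export-rtctrl" } } }
  ] }
}'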

Thanks for reading!