Ansible for Networking - Part 2: The Lab environment
This is the second part in my ongoing series on using Ansible for Networking, showing how to use Ansible to configure and manage equipment from multiple networking vendors.
You can view the other posts in the series below: -
- Part 1 - Start of the series
- Part 3 - Cisco IOS
- Part 4 - Juniper JunOS
- Part 5 - Arista EOS
- Part 6 - MikroTik RouterOS
- Part 7 - VyOS
In the “Start of the series” post, I mentioned that the lab would consist of: -
- The KVM hypervisor running on Linux
- A virtual machine, running CentOS 8, that will run: -
- FRR - Acting as a route server
- Syslog
- Tacplus (for TACACS+ integration)
- Two routers/virtual machines of each vendor, one running as an “edge” router, one running as an “internal” router
- A control machine that Ansible will run from, over a management network to all machines
This post goes through the Hypervisor, setting up the CentOS 8 virtual machine, and the control machine.
The Hypervisor
The Hypervisor in this scenario is KVM, running on my Manjaro-based laptop. Rather than trying to run this on the KVM machines in my home network, using my laptop allowed me to make changes to the environment without impacting the services on the network in my house. The reason for Manjaro is simply that I like their i3wm implementation.
As KVM is baked into the Linux kernel, just about every distribution of Linux can support it.
Networking
For networking, I run three bridge interfaces: -
- virbr0 - The default NAT bridge that is installed as part of KVM (which allows access out to the internet)
- virbr1 - An isolated network (i.e. one which allows traffic between VMs) that serves as a management network (named network)
- virbr2 - Another isolated network, over which VLANs will be passed between VMs (named vlan-bridge)
Rather than creating separate bridges for each separate network/subnet in use, I decided that a common bridge with VLANs tagged across it would be far easier to manage. Also, some of the virtual machine images are limited in how many “physical” interfaces (i.e. virtual NICs) they can support.
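For reference, an isolated network like vlan-bridge can be defined in libvirt with a small piece of XML. The below is a minimal sketch (names match my setup); the omission of a <forward> element is what makes the network isolated: -

<!-- Minimal sketch: no <forward> element means no NAT/routing to the outside -->
<network>
  <name>vlan-bridge</name>
  <bridge name='virbr2' stp='on' delay='0'/>
</network>

This can then be loaded with virsh net-define vlan-bridge.xml, started with virsh net-start vlan-bridge, and set to start on boot with virsh net-autostart vlan-bridge.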
Anything else?
Other than the two extra network bridges, the KVM setup is largely default. I tend towards using virtio drivers where the virtual machine will support them (some networking OSs recommend the E1000 Intel NIC emulation instead), and all hard disk images are stored in /var/lib/libvirt/images (as per the default KVM setup).
Virtual Machine Configuration
All the virtual machines have at least two network interfaces. Each machine has an interface connected to the Management network, and also an interface connected to the VLAN bridge. VLANs are carried across the vlan-bridge using 802.1q-based VLAN tagging. To check that your kernel supports this, run modprobe 8021q. If no errors are returned, you can pass VLANs without an issue.
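As an aside, once the 8021q module is loaded, a tagged subinterface can be created by hand with iproute2. The NIC name below is hypothetical; the VLAN and address come from the Cisco ranges used later in this post: -

# Create VLAN 101 on top of eth0 (hypothetical interface name)
ip link add link eth0 name eth0.101 type vlan id 101
# Address it from the Cisco "edge" range and bring it up
ip addr add 10.100.101.254/24 dev eth0.101
ip link set dev eth0.101 up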
Using IDs for the lab
To make the lab easy to work with and troubleshoot, I am using an “ID” for each vendor. This ID will be used to form the VLANs, IP addressing and Autonomous System. This is different from the virtual machine IDs, which are generated by the host operating system.
This means that when I need to look at any issues in the lab (say, not seeing certain routes), I know which virtual machine to look at.
The ID system looks something like the below: -
| Vendor        | ID  | Edge VLAN | Internal VLAN |
|---------------|-----|-----------|---------------|
| Cisco IOS     | 01  | 101       | 201           |
| Juniper JunOS | 02  | 102       | 202           |
| Arista EOS    | 03  | 103       | 203           |
| etc           | etc | etc       | etc           |
Further to this, the IP addressing and Autonomous System numbers would be: -
- Vendor: Cisco
- ID: 01
- IPv4 Subnet on VLAN101: 10.100.101.0/24
- IPv4 Subnet on VLAN201: 10.100.201.0/24
- IPv6 Subnet on VLAN101: 2001:db8:101::/64
- IPv6 Subnet on VLAN201: 2001:db8:201::/64
- IPv4 Loopback Addresses: 192.0.2.101/32 and 192.0.2.201/32
- IPv6 Loopback Addresses: 2001:db8:901:beef::1/128 and 2001:db8:901:beef::2/128
- BGP Autonomous System Number: AS65101
Each is explained a bit further below, but using this system does make verification and troubleshooting much easier.
VLAN scheme
The VLAN scheme is defined as follows: -
- 1$ID (e.g. 101, 102, 103) - Connectivity from the edge router to the CentOS Virtual Machine
- 2$ID (e.g. 201, 202, 203) - Connectivity between the edge router and the internal router
IP addressing scheme
IPv4 addressing is defined as follows: -
- 10.100.1$ID.0/24 (e.g. 10.100.101.0/24, 10.100.102.0/24) - Connectivity from the edge router to the CentOS Virtual Machine
- 10.100.2$ID.0/24 (e.g. 10.100.201.0/24, 10.100.202.0/24) - Connectivity between the edge router and the internal router
- 10.15.30.0/24 - The management network; each machine gets an IP in this network
- 192.0.2.0/24 - The loopback range, which will be used for Router IDs and iBGP connectivity
IPv6 addressing is defined as follows: -
- 2001:db8:1$ID::0/64 (e.g. 2001:db8:101::1/64, 2001:db8:101::10/64) - Connectivity from the edge router to the CentOS Virtual Machine
- 2001:db8:2$ID::0/64 (e.g. 2001:db8:201::1/64, 2001:db8:201::10/64) - Connectivity between the edge router and the internal router
- 2001:db8:9NN:beef::/64 - The loopback range, which will be used for Router IDs and iBGP connectivity
No management range has been assigned for IPv6 in this lab.
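To show how mechanical the scheme is, the below (purely illustrative, not part of the lab tooling) derives everything from a vendor ID with simple string concatenation: -

ID=01                                        # Cisco, per the table above
echo "Edge VLAN:     1${ID}"                 # 101
echo "Internal VLAN: 2${ID}"                 # 201
echo "Edge IPv4:     10.100.1${ID}.0/24"     # 10.100.101.0/24
echo "Internal IPv4: 10.100.2${ID}.0/24"     # 10.100.201.0/24
echo "Edge IPv6:     2001:db8:1${ID}::/64"   # 2001:db8:101::/64
echo "BGP ASN:       AS651${ID}"             # AS65101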
The CentOS 8 Virtual Machine
In my current role, nearly all of our Linux estate runs on Debian (apart from some Amazon EC2s that run Amazon Linux). Previously, most of my professional Linux experience has been with RHEL and/or CentOS.
Since starting my current role, CentOS 8 has been released. I decided to use this series to familiarise myself with the changes from CentOS 7 to CentOS 8.
The CentOS 8 Virtual Machine, which from now on will be referred to as netsvr-01 or netsvr, is (almost) entirely managed by Ansible. This includes the configuration for FRR (for routing), tac_plus (for TACACS+ integration), syslog-ng (for logging purposes), as well as managing firewalld and any package dependencies.
Install and user configuration
Installing the operating system was done manually, rather than using something like PXE, Vagrant, or Packer. I currently do not run a PXE server at home, and I have not used Vagrant or Packer with KVM previously. This is something I’ll look at in a future post, but for the purposes of this series, it doesn’t add any benefits.
The initial user setup (i.e. adding the Ansible user) was also not automated. This is because I wanted to avoid the cyclical dependency of needing a user with sufficient privileges that Ansible would use, to allow Ansible to create users. There are ways around this (potentially using something like cloud-init, or other methods), but for now adding the user myself was sufficient.
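For completeness, the manual bootstrap amounts to something like the below (the username is an example, and the IP is this lab’s netsvr management address, not a prescription): -

# On the freshly-installed VM, as root: create the user Ansible will connect as
useradd ansible
# Allow passwordless sudo, so Ansible can escalate with become
echo 'ansible ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/ansible
chmod 0440 /etc/sudoers.d/ansible

# From the control machine: install your public key for passwordless SSH
ssh-copy-id ansible@10.15.30.252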
Tooling
Rather than templating configuration files (and making liberal usage of the Ansible copy task), I decided to try and make use of the native CentOS 8 tooling where possible. This includes dnf for package management, firewalld for firewall management, and NetworkManager for network interface management.
Many of these tools also have associated Ansible modules, with excellent documentation.
However in using this approach, I came across some interesting issues and caveats.
What caveats?
Ansible and NetworkManager
Ansible previously used the networkmanager-glib library for interacting with NetworkManager. However this library has been deprecated, and is not included in CentOS 8. Instead, the recommended library is networkmanager-libnm.
As of writing this post, Ansible (v2.9.5) will not interact with NetworkManager unless networkmanager-glib is installed. This dependency issue (and compatibility with networkmanager-libnm) is due to be fixed, and has been merged into the Ansible master branch, but it is currently scheduled for version 2.10.
For now, all network additions and changes on the netsvr machine will be done manually using nmcli. This avoids spending time creating network configuration templates (in Jinja2) that will not be required soon anyway.
In the meantime, I have commented out the NetworkManager-specific sections of my playbooks, and will re-enable them when the support is available.
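For reference, a rough nmcli equivalent of the commented-out task (exact property names can vary slightly between NetworkManager versions) would be: -

# Create the loopback bridge by hand, mirroring the commented-out Ansible task
nmcli connection add type bridge con-name bridge-loopback ifname bridge-lo0 \
  ipv4.method manual ipv4.addresses 192.0.2.1/32
nmcli connection up bridge-loopback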
FRR packages and dependencies
The latest RPM packages for FRR (at the time of writing) have dependencies on libraries that are not present in CentOS 8. This is not entirely surprising, as the latest release was packaged for CentOS 7, rather than CentOS 8. As CentOS 7 is still the most common version of CentOS, and still supported, I expect this is a problem across many other applications too.
Where libraries and dependencies still exist in CentOS 8, I have been able to install CentOS 7 packages without issue (for example, with tac_plus). However with FRR I am relying on the version that is in the CentOS 8 repositories, which is a couple of versions behind the current one.
This is not a major issue, as all the features I require are in this version. If I really do need them, FRR do provide a guide for compiling FRR on CentOS 8.
I believe as time goes on, CentOS 8 (or at least RHEL8-based systems) will become the “standard” version to target (for all CentOS/RHEL-based RPMs/releases), and problems like this will go away.
Anything else?
For the most part, I have not found any other issues in utilising CentOS 8, rather than say Debian Buster (my usual choice) or CentOS 7 (the version with “better” application support currently). For example, dnf is very similar to yum in everyday usage, so managing packages with Ansible doesn’t require big changes conceptually.
FRR Configuration
FRR, or Free Range Routing, is a notable fork of Quagga that provides a number of routing protocols (and other useful network protocols, like VRRP and LDP) on Linux. It also has the vtysh shell package, which allows you to configure, verify and monitor using very Cisco-like syntax. It can be used to turn just about any Linux device into a router, or to allow a server to use dynamic routing.
In this lab, FRR is configured as a route server, and will be set up to allow peering with all the “edge” routers from each vendor.
Now please refer to the above where I said: -
Rather than templating configuration files (and making liberal usage of the Ansible copy task), I decided to try and make use of the native […] tooling where possible

So how am I managing the FRR configuration? With templated configuration files, making use of the template task (i.e. the copy task, but with templated variables)…
Why?!
The Ansible module for configuring BGP in FRR (frr_bgp) covers most common use cases. If you’re setting up standard BGP (IPv4 or IPv6) peering, route reflectors, and all the usual configuration options (e.g. route-map, prefix-list etc.), then all of these use cases are covered. Currently though, it does not support dynamic BGP neighbours.
Traditionally, BGP requires that you configure your peers statically, with the IP address of the specific neighbour (e.g. neighbor 192.168.1.1 remote-as 65001). You can use techniques like route reflection or confederation to reduce the amount of configuration required, but it still requires a known (and therefore static) set of peers to configure.
Recently, a number of vendors have added a feature called dynamic BGP peering. This means that one side can listen for peers, and those that meet certain requirements can form a BGP peering session with it.
Dynamic BGP peering originated because of the recent trend for using BGP in the data centre. Allowing peers to dynamically form means less static configuration. It also allows common configuration across multiple devices, as opposed to different peering configuration based upon where it is installed in the network. Devices can be pre-provisioned with the same configuration, and added to the network with ease, regardless of where they are physically connected.
To configure dynamic peers, you configure either a “prefix” (i.e. a subnet/range of IP addresses that peers could be coming from), an interface (e.g. eth0), or both, as part of a BGP peer group (essentially a set of configuration parameters common across peers). If a device attempts to peer with the “listening” BGP process and comes from either the “prefix” or “interface” specified, then as long as it meets the other configuration parameters, a BGP peering session will be formed.
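In FRR syntax, a dynamic peering configuration looks something like the below (values taken from this lab’s addressing scheme, with an assumed listen limit): -

router bgp 65430
 neighbor cisco peer-group
 neighbor cisco remote-as 65101
 bgp listen range 10.100.101.0/24 peer-group cisco
 bgp listen limit 100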
Admittedly in this lab, only the “edge” router from each vendor will peer with the netsvr machine. FRR will only ever see one device from the prefix range attempt to peer with it. However it does make it easier to add a secondary device (say, to test failover), as the FRR configuration would not require any changes.
Ansible Role
I am using Ansible Roles to configure FRR, with a directory structure as follows: -
$ tree frr
frr
├── defaults
│ └── main.yml
├── files
├── handlers
│ └── main.yml
├── meta
│ └── main.yml
├── README.md
├── tasks
│ └── main.yml
├── templates
│ └── bgpd.conf.j2
├── tests
│ ├── inventory
│ └── test.yml
└── vars
└── main.yml
To create this Ansible role, I used ansible-galaxy init frr. This automatically creates the directory structure, as well as all of the YAML files and the test inventory file.
Tasks
The tasks/main.yml looks like the below: -
# tasks file for frr
- name: NetworkManager libnm
  dnf:
    name: NetworkManager-libnm
    state: present

########################################
# Commenting out until NMCLI is fixed  #
########################################
#- name: Create loopback interface
#  nmcli:
#    type: bridge
#    autoconnect: yes
#    conn_name: bridge-loopback
#    ifname: bridge-lo0
#    ip4: "{{ loopback.ip4 }}"
#    state: present

- name: Install FRR
  dnf:
    name:
      - frr
    state: present

- name: Enable BGP
  lineinfile:
    path: /etc/frr/daemons
    regexp: 'bgpd=no'
    line: 'bgpd=yes'
  register: frr_bgp_daemon

- name: Enable zebra
  lineinfile:
    path: /etc/frr/daemons
    regexp: 'zebra=no'
    line: 'zebra=yes'
  register: frr_zebra_daemon

- name: BGP Config
  template:
    src: bgpd.conf.j2
    dest: /etc/frr/bgpd.conf
    owner: frr
    group: frr
  register: frr_bgp_config

- name: Allow BGP through FirewallD
  firewalld:
    port: 179/tcp
    permanent: yes
    state: enabled
    zone: public

- name: Run FRR
  service:
    name: frr
    state: restarted
    enabled: true
  when: frr_bgp_daemon.changed or frr_zebra_daemon.changed or frr_bgp_config.changed
As noted previously, until the Ansible NetworkManager module works correctly, the nmcli task is commented out. To summarise what is done here: -

- Install NetworkManager-libnm and frr (using dnf)
- Update /etc/frr/daemons to enable the BGP and Zebra daemons
- Generate the /etc/frr/bgpd.conf file for BGP configuration
- Allow BGP through the firewall (using firewalld)
- Restart FRR, if the configuration of either /etc/frr/daemons or /etc/frr/bgpd.conf has changed
  - In a production scenario, reloading would be preferable, but restarting on changes is fine in a lab
The register option creates a variable that is updated with the status of the task. If the task has a status of changed (i.e. the configuration files have been updated), then $variable.changed (e.g. frr_bgp_daemon.changed) evaluates to True.
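As an aside, the more idiomatic Ansible pattern for “restart only on change” is notify plus a handler, which removes the need for the three register variables. A minimal sketch of what that could look like (not how the role is currently written): -

# tasks/main.yml (sketch) - notify the handler whenever the template changes
- name: BGP Config
  template:
    src: bgpd.conf.j2
    dest: /etc/frr/bgpd.conf
  notify: Restart FRR

# handlers/main.yml (sketch) - runs once, at the end of the play, only if notified
- name: Restart FRR
  service:
    name: frr
    state: restarted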
Template
The template that is used to generate the bgpd.conf configuration file looks like the below: -
frr version 7.0
frr defaults traditional
!
hostname netsvr-01
!
!
!
router bgp {{ frr['asn'] }}
bgp log-neighbor-changes
no bgp default ipv4-unicast
{% for group in frr['bgp'] %}
neighbor {{ group }} peer-group
neighbor {{ group }} remote-as {{ frr['bgp'][group]['asn'] }}
{% if 'ipv4' in frr['bgp'][group]['listen_range'] %}
bgp listen range {{ frr['bgp'][group]['listen_range']['ipv4'] }} peer-group {{ group }}
{% endif %}
{% if 'ipv6' in frr['bgp'][group]['listen_range'] %}
bgp listen range {{ frr['bgp'][group]['listen_range']['ipv6'] }} peer-group {{ group }}
{% endif %}
{% endfor %}
address-family ipv4 unicast
{% for group in frr['bgp'] %}
{% if 'ipv4' in frr['bgp'][group]['address_family'] %}
{% if 'unicast' in frr['bgp'][group]['address_family']['ipv4']['safi'] %}
neighbor {{ group }} activate
{% if 'networks' in frr['bgp'][group]['address_family']['ipv4'] %}
{% for network in frr['bgp'][group]['address_family']['ipv4']['networks'] %}
network {{ network }}
{% endfor %}
{% endif %}
{% endif %}
{% endif %}
address-family ipv6 unicast
{% if 'ipv6' in frr['bgp'][group]['address_family'] %}
{% if 'unicast' in frr['bgp'][group]['address_family']['ipv6']['safi'] %}
neighbor {{ group }} activate
{% if 'networks' in frr['bgp'][group]['address_family']['ipv6'] %}
{% for network in frr['bgp'][group]['address_family']['ipv6']['networks'] %}
network {{ network }}
{% endfor %}
{% endif %}
{% endif %}
{% endif %}
{% endfor %}
!
!
line vty
!
For those who haven’t used Jinja2 (or Python, with which Jinja2 shares some syntax) before, this can look a bit opaque, so to summarise each section: -
router bgp {{ frr['asn'] }}
bgp log-neighbor-changes
no bgp default ipv4-unicast
- Start BGP, using the Autonomous System number provided by the frr['asn'] variable
- Log any changes in neighbour states (e.g. neighbour up, neighbour down)
- For any neighbours configured, do not automatically enable IPv4 BGP peering
  - You can activate it on a per-peer or per-group basis instead
Disabling bgp default ipv4-unicast is useful when you run different address families (e.g. l2vpn or evpn), as it stops FRR automatically configuring a standard IPv4 BGP session to every peer (or peer-group) defined.
{% for group in frr['bgp'] %}
neighbor {{ group }} peer-group
neighbor {{ group }} remote-as {{ frr['bgp'][group]['asn'] }}
For all groups specified in the frr['bgp'] variable, create: -

- The peer-group, named after the group (which in this lab would be cisco, juniper or mikrotik, for example)
- The remote-as (i.e. the peer’s autonomous system) for the group
  - This is derived from the frr['bgp'][$THIS-SPECIFIC-GROUP]['asn'] variable (each group will have a different ASN)
{% if 'ipv4' in frr['bgp'][group]['listen_range'] %}
bgp listen range {{ frr['bgp'][group]['listen_range']['ipv4'] }} peer-group {{ group }}
{% endif %}
- If there is an IPv4 section in the group, create a dynamic listening range
- The listening range will be an IPv4 prefix/subnet
{% if 'ipv6' in frr['bgp'][group]['listen_range'] %}
bgp listen range {{ frr['bgp'][group]['listen_range']['ipv6'] }} peer-group {{ group }}
{% endif %}
- As per the above, but for IPv6 (a listening range, but using an IPv6 prefix)
address-family ipv4 unicast
- Enable the IPv4 unicast address family (i.e. standard IPv4 BGP peering)
{% for group in frr['bgp'] %}
{% if 'ipv4' in frr['bgp'][group]['address_family'] %}
{% if 'unicast' in frr['bgp'][group]['address_family']['ipv4']['safi'] %}
neighbor {{ group }} activate
There are three nested levels here (an if statement, inside an if statement, inside a for loop): -
- For loop - for all groups in the frr['bgp'] variable, then…
- First if statement - if IPv4 is defined as part of the group’s address_family variable, then…
- Second if statement - if unicast exists in the safi variable, then…
- Activate the peer group
{% if 'networks' in frr['bgp'][group]['address_family']['ipv4'] %}
{% for network in frr['bgp'][group]['address_family']['ipv4']['networks'] %}
network {{ network }}
{% endfor %}
{% endif %}
{% endif %}
{% endif %}
More nested if statements! The above will only be evaluated if the IPv4 unicast peer group is set to be activated, as otherwise any associated networks would never be advertised.

- If the above evaluates as true, then…
- For all networks listed in the networks variable, create a network statement (i.e. advertise a subnet)
address-family ipv6 unicast
{% if 'ipv6' in frr['bgp'][group]['address_family'] %}
{% if 'unicast' in frr['bgp'][group]['address_family']['ipv6']['safi'] %}
neighbor {{ group }} activate
{% if 'networks' in frr['bgp'][group]['address_family']['ipv6'] %}
{% for network in frr['bgp'][group]['address_family']['ipv6']['networks'] %}
network {{ network }}
{% endfor %}
{% endif %}
{% endif %}
{% endif %}
{% endfor %}
This does the same for IPv6 as the previous statements did for IPv4.
If you are not familiar with Jinja2 syntax, this may look daunting. I would recommend starting with the official Jinja2 documentation and Ansible’s templating guides, and then soon all of the above will start to make sense.
Variables
I referenced the use of multiple variables in the template above, but where do these variables come from? In this case, I am using Ansible host_vars
, which are host specific variables. They can be fined in an INI-style format, or YAML. I prefer YAML for this, as while you have to be careful with spaces and indentation, they are grouped together in a way which makes sense to me.
The variables I have used for the FRR configuration are as follows: -
frr:
  asn: 65430
  bgp:
    cisco:
      asn: 65101
      listen_range:
        ipv4: 10.100.101.0/24
        ipv6: "2001:db8:101::0/64"
      address_family:
        ipv4:
          safi: unicast
          networks:
            - 192.0.2.1/32
        ipv6:
          safi: unicast
          networks:
            - "2001:db8:999:beef::1/128"
In the template above, each section of the variable (i.e. each set of square brackets) refers to the next “level” down in the YAML variables defined above. For example, frr['bgp'][group]['address_family']['ipv4']['networks'] would refer to: -
frr:
  bgp:
    cisco:
      address_family:
        ipv4:
          networks:
            - 192.0.2.1/32
The reason for group not having single quotation marks is that it is derived from the for loop, rather than being a hardcoded string. This allows you to loop through each group, rather than having to add sections of the template that are specific to each vendor/group.
When the BGP configuration template is generated, using the variables provided above, the output looks like so: -
router bgp 65430
bgp log-neighbor-changes
no bgp default ipv4-unicast
neighbor cisco peer-group
neighbor cisco remote-as 65101
bgp listen range 10.100.101.0/24 peer-group cisco
bgp listen range 2001:db8:101::0/64 peer-group cisco
address-family ipv4 unicast
neighbor cisco activate
network 192.0.2.1/32
address-family ipv6 unicast
neighbor cisco activate
network 2001:db8:999:beef::1/128
The indentation could be tidied up, but the above is a fully functioning FRR BGP configuration, as seen below: -
netsvr-01# show running-config
Building configuration...
Current configuration:
!
frr version 7.0
frr defaults traditional
hostname netsvr-01
no ip forwarding
no ipv6 forwarding
!
router bgp 65430
bgp log-neighbor-changes
no bgp default ipv4-unicast
neighbor cisco peer-group
neighbor cisco remote-as 65101
bgp listen range 10.100.101.0/24 peer-group cisco
bgp listen range 2001:db8:101::/64 peer-group cisco
!
address-family ipv4 unicast
network 192.0.2.1/32
neighbor cisco activate
exit-address-family
!
address-family ipv6 unicast
network 2001:db8:999:beef::1/128
neighbor cisco activate
exit-address-family
!
line vty
netsvr-01# show bgp ipv4 summary
IPv4 Unicast Summary:
BGP router identifier 192.168.122.81, local AS number 65430 vrf-id 0
BGP table version 1
RIB entries 1, using 160 bytes of memory
Peers 1, using 21 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
*10.100.101.253 4 65101 4 4 0 0 0 00:01:03 0
Total number of neighbors 1
* - dynamic neighbor
1 dynamic neighbor(s), limit 100
syslog-ng Configuration
syslog-ng is a syslog daemon that, in this scenario, will be used for storing logs from each network device. This means you can look at logs from across the network in one place, rather than retrieving them from each device manually.
Ansible role
The role is created with ansible-galaxy init syslog. The directory structure is as follows: -
$ tree syslog
syslog
├── defaults
│ └── main.yml
├── files
│ └── syslog-remote.conf
├── handlers
│ └── main.yml
├── meta
│ └── main.yml
├── README.md
├── tasks
│ └── main.yml
├── templates
├── tests
│ ├── inventory
│ └── test.yml
└── vars
└── main.yml
Tasks
The tasks/main.yml file looks like the below: -
# tasks file for syslog
- name: Remove rsyslog
  dnf:
    name:
      - rsyslog
    state: absent

- name: Install syslog-ng
  dnf:
    name:
      - syslog-ng
    state: present

- name: Remote Syslog
  copy:
    src: syslog-remote.conf
    dest: /etc/syslog-ng/conf.d/syslog-remote.conf
  register: syslog_conf

- name: Remote Syslog directory
  file:
    state: directory
    path: /var/log/remote
    owner: root
    group: root
    mode: 0755

- name: Reload syslog-ng
  service:
    name: syslog-ng
    state: restarted
    enabled: yes
  when: syslog_conf.changed

- name: Allow syslog through FirewallD
  firewalld:
    service: syslog
    permanent: yes
    state: enabled
    zone: public
Steps: -

- Remove rsyslog with dnf (as it conflicts with syslog-ng)
- Install syslog-ng with dnf
- Add the syslog-remote.conf file
- Add the /var/log/remote directory, to store logs from the network devices
- Reload syslog-ng, only if the configuration has changed
- Allow syslog-ng through the firewall with firewalld
We’re not using any templating or providing any extra variables, because the configuration required is static.
Syslog-ng remote configuration
The configuration required to enable remote logging within syslog-ng looks like the below: -
source net { udp(); };
destination remote { file("/var/log/remote/${FULLHOST}" template("${ISODATE} ${HOST}: ${MSGHDR}${MESSAGE}\n") ); };
log { source(net); destination(remote); };
The files will be created as $HOSTNAME or $IP in /var/log/remote, in the format ISODATE HOSTNAME: %SYSLOG-PROGRAM Syslog message.
An example of the output can be seen below: -
$ pwd
/var/log/remote
$ ls
10.100.101.253
$ cat 10.100.101.253
2020-02-23T14:23:21-05:00 10.100.101.253: %CRYPTO-6-ISAKMP_ON_OFF: ISAKMP is OFF
2020-02-23T14:23:21-05:00 10.100.101.253: %CRYPTO-6-GDOI_ON_OFF: GDOI is OFF
2020-02-23T14:23:21-05:00 10.100.101.253: %SYS-6-LOGGINGHOST_STARTSTOP: Logging to host 10.100.101.254 port 514 started - CLI initiated
2020-02-23T14:23:37-05:00 10.100.101.253: %BGP-5-ADJCHANGE: neighbor 10.100.101.254 Up
You could then find out if, for example, multiple BGP peers had dropped, by running grep -i bgp /var/log/remote/* | grep -i down. This would return matches from all the files (which are named based upon the devices) that contain BGP drops.
With tools like the Elastic Stack, Graylog or Splunk, it is now possible to index logs (making them quicker to search, based upon the type of queries used), create dashboards and alerts based upon them, and much more. Even so, running syslog-ng (or another syslog daemon) can still help you gather huge insights into where you may be having issues in your network.
tac_plus Configuration
tac_plus is a daemon that can be used for TACACS+-based authentication and authorization (as an alternative to RADIUS). This allows you to manage your users centrally on a server (such as this one), so that you can log in to any device in the network with your username and password. It can also assign privileges to the user, based upon “privilege” levels. These levels are configured in groups, which users can be added to.
Ansible role
The role is created with ansible-galaxy init tacplus. The directory structure is as follows: -
$ tree tacplus
tacplus
├── defaults
│ └── main.yml
├── files
├── handlers
│ └── main.yml
├── meta
│ └── main.yml
├── README.md
├── tasks
│ └── main.yml
├── templates
│ └── tac_plus.conf.j2
├── tests
│ ├── inventory
│ └── test.yml
└── vars
└── main.yml
Tasks
The tasks/main.yml file looks like the below: -
# tasks file for tacplus
- name: Nux Repo
  yum_repository:
    name: nux-misc
    description: nux-misc
    baseurl: http://li.nux.ro/download/nux/misc/el7/x86_64/
    enabled: 0
    gpgcheck: 1
    gpgkey: http://li.nux.ro/download/nux/RPM-GPG-KEY-nux.ro

- name: Install tcp-wrappers (not in CentOS 8)
  dnf:
    name: 'http://mirror.centos.org/centos/7/os/x86_64/Packages/tcp_wrappers-libs-7.6-77.el7.x86_64.rpm'
    state: present

- name: Install tac_plus
  dnf:
    name: tac_plus
    enablerepo: nux-misc
    state: present

- name: Generate configuration
  template:
    src: tac_plus.conf.j2
    dest: /etc/tac_plus.conf
  register: tac_conf

- name: Restart tac_plus
  service:
    name: tac_plus
    state: restarted
    enabled: yes
  when: tac_conf.changed

- name: Allow tacacs through FirewallD
  firewalld:
    port: 49/tcp
    permanent: yes
    state: enabled
    zone: public
Steps: -
- Add the nux-misc repository to yum (which dnf makes use of)
  - Disabled by default, only used when it is specifically called for
- Install tcp-wrappers (deprecated in CentOS 8), a tac_plus dependency, directly from an RPM file
- Install tac_plus using dnf, enabling the nux-misc repository to do so
- Generate the tac_plus configuration
- Restart tac_plus, if the configuration is changed
- Allow the tacacs port through the firewall, using firewalld
Configuration template
The tac_plus.conf.j2 configuration template looks like the below: -
# Created by Henry-Nicolas Tourneur([email protected])
# See man(5) tac_plus.conf for more details

# Define where to log accounting data, this is the default.
accounting file = /var/log/tac_plus.acct

# This is the key that clients have to use to access Tacacs+
key = {{ tacacs_key }}

group = netwrite {
    default service = permit
    service = exec {
        priv-lvl = 15
    }
}

{% for user in netusers %}
user = {{ user }} {
    member = netwrite
    login = des {{ netusers[user]['tacpwd'] }}
}
{% endfor %}
Compared to the FRR bgpd.conf.j2 file, there are far fewer parts to generate. We supply the tacacs_key, which is used to encrypt the messages between the network device and the tac_plus server. We also supply a list of users, with passwords, to generate this file.
The netwrite group has priv-lvl 15, which is analogous to full admin access on each network device. It is possible to create groups with read-only permissions, or with additional permissions. You could create a group so that certain write commands are allowed (for example, reloading a BGP neighbour or clearing statistics on an interface), but all other commands are restricted.
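As an untested sketch of that idea, a read-only group in tac_plus.conf could look something like the below, denying everything by default and explicitly permitting show commands: -

group = netread {
    # Deny any service/command not explicitly permitted below
    default service = deny
    service = exec {
        priv-lvl = 1
    }
    # Permit all variants of "show ..."
    cmd = show {
        permit .*
    }
}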
The actual list of users is defined in host_vars.
Host Variables
The host_vars specific to tac_plus are: -
tacacs_key: supersecret
netusers:
  yetiops:
    tacpwd: !vault |
      $ANSIBLE_VAULT;1.1;AES256
      66336131323637326166316232623161663630373739613137366266633937306662323363333039
      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxREDACTEDxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxREDACTEDxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxREDACTEDxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      6165
  davethechicken:
    tacpwd: !vault |
      $ANSIBLE_VAULT;1.1;AES256
      19784782343345848123148123094812389452340958230495809234846666642381109434123412
      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxREDACTEDxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxREDACTEDxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxREDACTEDxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      6162
The passwords that tac_plus expects are DES-hashed (yes, not even 3DES!). The easiest way to generate them is by using tac_pwd: -
$ tac_pwd
Password to be encrypted: test
CCVwN31H4K74A
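If tac_pwd is not to hand, openssl passwd -crypt should produce the same style of crypt(3) DES hash (note that the -crypt option has been removed in newer OpenSSL releases): -

# Uses a random salt, so the output differs on each run
openssl passwd -crypt 'test'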
The user passwords are then encrypted in this file using Ansible Vault, which allows you to store sensitive data in version control, as they require an encryption key to unlock.
To encrypt a string, you would use the following: -
$ ansible-vault encrypt_string 's3cr3tp8ss' --name 'pass'
New Vault password:
Confirm New Vault password:
pass: !vault |
$ANSIBLE_VAULT;1.1;AES256
64373235663534646635306363626365376537343137393136623863626332303235386264393237
3435313266336633633430646462393138353331633734340a356265336366663030313338393965
31643738383461616465626435376265333739663031366636353865373938663236653262396366
3833346263653436380a333936633363303038646333613832313564316566313534373537396433
3366
Encryption successful
You can then copy and paste this secret into your host_vars.
You can store your encryption keys in local files (and reference them with ansible-playbook --vault-password-file /path/to/vault-keys), so that Ansible does not need to ask for them when you run your playbooks.
Alternatively, you can make Ansible ask you for the encryption key, meaning you can then store the encryption key in whatever password management system you choose. To do this, see below: -
$ ansible-playbook centos.yaml --ask-vault-pass
Vault password:
PLAY [centos] *************************************************
TASK [Gathering Facts] ****************************************
ok: [10.15.30.252]
[...]
Without this, your Playbook run will fail, as it will not be able to decrypt your keys.
Generated configuration file
The configuration file for tac_plus, when generated from the template, looks like the below: -
# Created by Henry-Nicolas Tourneur([email protected])
# See man(5) tac_plus.conf for more details

# Define where to log accounting data, this is the default.
accounting file = /var/log/tac_plus.acct

# This is the key that clients have to use to access Tacacs+
key = supersecret

group = netwrite {
    default service = permit
    service = exec {
        priv-lvl = 15
    }
}

user = yetiops {
    member = netwrite
    login = des ###REDACTED###
}

user = davethechicken {
    member = netwrite
    login = des ###REDACTED###
}
With this, you can then configure TACACS+-based authentication on your network, and then login to your network devices with the users defined in this file.
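As a taster of the device side (covered properly later in the series), the Cisco IOS equivalent would be something like the below; the server address here assumes the netsvr management IP, and the key must match the tacacs_key above: -

aaa new-model
! Define the TACACS+ server (10.15.30.252 assumed as netsvr's management IP)
tacacs server netsvr
 address ipv4 10.15.30.252
 key supersecret
! Authenticate and authorize against TACACS+, falling back to local users
aaa authentication login default group tacacs+ local
aaa authorization exec default group tacacs+ local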
Top-level playbook
The playbook that includes all of the roles, as well as defining what hosts it will run on, is in the directory level below the roles: -
$ tree -L 1
.
├── ansible.cfg
├── ansible.log
├── centos.yaml
├── epel <- Role directory
├── frr <- Role directory
├── host_vars
├── inventory
├── README.md
├── syslog <- Role directory
└── tacplus <- Role directory
5 directories, 5 files
The contents of the centos.yaml playbook are: -
- hosts: centos
  become: true
  become_method: sudo
  tasks:
    - import_role:
        name: epel
    - import_role:
        name: frr
    - import_role:
        name: syslog
    - import_role:
        name: tacplus
There is an additional role here (epel), but all this does is install the epel-release package (Extra Packages for Enterprise Linux).
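As a sketch, the entire tasks/main.yml for the epel role could be as small as: -

# tasks file for epel (sketch, assuming nothing beyond the release package)
- name: Install EPEL release package
  dnf:
    name: epel-release
    state: present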
When you run this playbook, each role will be imported and run in order (so epel, frr, syslog, then tacplus). It will also, by default, pick up the host_vars/$IP_ADDRESS.yaml file for host-specific variables (ensure that $IP_ADDRESS is replaced with the IP or hostname you have defined in your Ansible inventory).
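For reference, the inventory file itself can be as simple as the below (the IP matches the netsvr host seen in the playbook run earlier): -

[centos]
10.15.30.252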
Other files
I also have a few settings in the ansible.cfg file: -
[defaults]
inventory = ./inventory
timeout = 5
log_path = ./ansible.log
The above specifies my inventory file as ./inventory, adds a timeout (more useful for network devices, but I’m keeping it for consistency), and creates a log file of every playbook run. This makes it easier to debug, or to go back and look at where changes were made that potentially broke a playbook run.
Ansible control machine
As Ansible can run from just about anywhere, the choice of how you invoke your playbooks is down to personal preference.
In a production scenario, you would usually have a machine (or machines) with access to your devices, and run your playbooks from there. This means that a team of people can make changes and run them from the same place (rather than playbooks going out of sync on people’s workstations). Alternatively, you can run something like Ansible Tower or AWX (the upstream project that Ansible Tower builds upon) to manage your infrastructure.
In this scenario, as it is a lab environment, I am running all of the playbooks from the same laptop that is running the lab. I use passwordless SSH where it is supported (not every network vendor does support this), and I maintain all my playbooks in a Git repository (that I will make public during the series).
Summary
My approach to setting up the lab environment has been to make use of the native tools available (either on my laptop, or the virtual machines themselves), while also trying to keep things as simple as possible. Thanks to taking this approach, and because it is all managed using Ansible Playbooks, I can easily recreate this setup on other machines.
The next few posts in the series will get into managing actual network devices themselves. I also have a few bonus posts to make during this series, thanks to the generosity and help from the readers of this site.
Hopefully this will help get you up and running with your own lab, so you can test these kinds of scenarios yourself!