I have a few VMs and PMs around the house that I’d set up over time, and I’d now like to rebuild some of them, not to mention simplify the whole lot.
How the hell do I get from a working system to an equivalent ansible playbook without many (MANY) iterations of trial & error - and potentially destroying the running system??
Ducking around didn’t really show much so I’m either missing a concept / keyword, or, no-one does this.
Pointers?
TIA
I went through this about 6 months ago.
Just build playbooks from basic to specific. I did so in three parts:
- Container creation
- Basic settings common to all my hosts
- Specific service config & software
Ansible lends itself to a hierarchy of roles applied per service, so layering playbooks this way should help.
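Something along these lines, for example (the group and role names are just placeholders, not my actual setup):

```yaml
# site.yml - rough sketch of the three layers; all names are placeholders
- name: Container creation
  hosts: hypervisors
  roles:
    - container_creation

- name: Basic settings common to all my hosts
  hosts: all
  roles:
    - common_baseline      # users, SSH, time zone, base packages...

- name: Specific service config & software
  hosts: webservers
  roles:
    - nginx_site           # one role per service
```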
Ducking around
giggle
What I did to get rid of my mess, was to containerize service after service using podman. I mount all volumes in a unified location and define all containers as quadlets (systemd services). My backup therefore consists of the base directory where all my container volumes live in subdirectories and the directory with the systemd units for the quadlets.
That way I was able to slowly unify my setup without risking breaking everything at once. Plus, I can easily replicate it on any server that has podman.
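Roughly like this, if it helps to picture it; the paths, image and unit name are made-up examples rather than what I actually run, and I’ve written it as an Ansible play since that’s the topic here:

```yaml
# Hypothetical sketch: one quadlet-managed container plus its volume under a
# unified base directory. Paths and the image are examples only.
- name: Deploy an example quadlet container
  hosts: podman_hosts                    # placeholder inventory group
  become: true
  tasks:
    - name: Ensure the unified volume directory exists
      ansible.builtin.file:
        path: /srv/containers/whoami     # example base dir
        state: directory
        mode: "0750"

    - name: Install the quadlet unit (systemd generates a service from it)
      ansible.builtin.copy:
        dest: /etc/containers/systemd/whoami.container
        mode: "0644"
        content: |
          [Container]
          Image=docker.io/traefik/whoami:latest
          Volume=/srv/containers/whoami:/data:Z
          PublishPort=8080:80

          [Install]
          WantedBy=multi-user.target

    - name: Reload systemd so it picks up the new unit
      ansible.builtin.systemd:
        daemon_reload: true
```

Backing up /srv/containers and /etc/containers/systemd then covers both data and config.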
Do you have a GitHub repo? As I am building my system like this and was thinking of exactly using Podman and quadlets.
No, I keep that private to minimize the information I leak about what I host, sorry. (I also don’t do git-ops for my server; I back the mentioned directories up via kopia so in case of recovery I just restore the last working state of data+config. I don’t have much need to version the configs.)
I would copy the existing system onto a new system:
- Update system to the latest packages
- Create a new base system using the same distro
- Check which packages are not on the new system, add them to your playbook
- Install packages on new system
- This will take some time. Run a find of all files and pass them to md5sum or sha512sum to get a list of files with their checksum. Compare the list from the old system to the new system.
- Update your playbook with these findings. The template module is probably the way to go, lineinfile might be good as well, and use copy if nothing else works.
- Check firewall settings and update your playbook.
Anyhow this will take some iterations, but while you have a copy of your ‘production’ system, you can test on your ‘test’ machine until you have the same functionality.
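If it helps, the checksum step can even be done with Ansible itself; a rough sketch, with placeholder inventory names and paths:

```yaml
# Collect a "checksum  path" list on both boxes so the two files can be diffed.
# Inventory names and paths are placeholders.
- name: Gather config file checksums for comparison
  hosts: old_server:new_server
  become: true
  tasks:
    - name: Find files under /etc and record their checksums
      ansible.builtin.find:
        paths: /etc
        recurse: true
        file_type: file
        get_checksum: true               # sha1 by default
      register: etc_files

    - name: Write the list out for later diffing
      ansible.builtin.copy:
        dest: "/tmp/etc-checksums-{{ inventory_hostname }}.txt"
        content: |
          {% for f in etc_files.files | sort(attribute='path') %}
          {{ f.checksum }}  {{ f.path }}
          {% endfor %}
```

Then diff the two output files to see which configs drifted.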
Oh, the find with the hash sum is good advice! I would have done this manually, maybe with the Double Commander sync-dirs tool.
But also, for configs this might be the best time to move your custom settings into ordered drop-in files for everything that supports them.
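For example, sshd on reasonably recent OpenSSH reads /etc/ssh/sshd_config.d/; a hypothetical task (the settings themselves are just examples):

```yaml
# Ship only your own settings as an ordered drop-in instead of templating the
# whole sshd_config. Settings and file name are examples.
- name: Deploy custom sshd settings as a drop-in
  hosts: all
  become: true
  tasks:
    - name: Install 50-custom.conf drop-in
      ansible.builtin.copy:
        dest: /etc/ssh/sshd_config.d/50-custom.conf
        mode: "0644"
        content: |
          PasswordAuthentication no
          PermitRootLogin no
      notify: Restart sshd

  handlers:
    - name: Restart sshd
      ansible.builtin.service:
        name: sshd
        state: restarted
```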
Hmm, that’s not a bad shout actually…
I can fire up VMs to replicate the real system and maybe (depending on space) keep them as a testbed for future mods.
Thanks, good idea.
I have my moments… 😉 Feel free to pm me if you need more advice.
The first step is to work out what you did: what you installed and where from, then which config files got edited.
Much like a playbook for a disaster recovery test.
Next, using some of the built-in modules like package and copy, make a very noddy playbook that will install the software and copy up the config. If you have VMs, you can use a test VM to see if the playbook works.
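That noddy playbook might look something like this (host, packages and paths are placeholders):

```yaml
# Very rough first pass: install the packages the old box had and copy the
# known-good config across. All names and paths are placeholders.
- name: Rebuild the test VM
  hosts: testvm
  become: true
  tasks:
    - name: Install packages
      ansible.builtin.package:
        name: [nginx, rsync]
        state: present

    - name: Copy the existing config up
      ansible.builtin.copy:
        src: files/nginx.conf
        dest: /etc/nginx/nginx.conf
        backup: true
```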
If you’ve not played with Ansible then this might help 👉 https://www.jeffgeerling.com/project/ansible-101-youtube-series
Everything this guy makes is really well made. I had the pleasure of working with him many years ago, and he’s just as kind in person as he seems in his videos.
Lucky you.
Yep, I’d seen some other videos of his, he does seem genuinely interested in passing on his knowledge
Watching him makes me think I should do videos of stuff I know. But I’ve not found the want yet.
Yeah, but it’s hard work to make it look easy.
I made a video for work once, it was basically a recorded Teams call, but, good god, I was awful 🤭
Good point, not thought of that - thanks
You will need many iterations of trial and error; there’s no way around it.
You can speed up testing your playbook by using Molecule or something similar. Don’t touch your working VMs until you get a service (role) set up correctly in your test environment. If you need to set up multiple services in a single VM, you can automate their deployment sequentially, of course.
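A Molecule scenario is just a small YAML file that sits next to the role, roughly like this (the driver and image are examples, and the exact keys can vary between Molecule versions):

```yaml
# molecule/default/molecule.yml - hypothetical scenario that tests one role in
# a throwaway container instead of your real VMs.
driver:
  name: podman              # needs the Molecule podman plugin installed
platforms:
  - name: test-instance
    image: quay.io/centos/centos:stream9
provisioner:
  name: ansible
verifier:
  name: ansible
```

Running "molecule test" then creates the instance, applies the role, and destroys it again.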
P.S. I don’t like Ansible and won’t recommend it because it is full of bugs and non-obvious behavior. However, I haven’t investigated alternatives and can’t suggest a better one.
Could you elaborate a little bit on "full of bugs" and "non-obvious behaviour"? I’ve been using Ansible at work for a couple of years already and have never encountered anything like that. (I have about 10 playbooks, about 30 roles, and about 20 Linux servers that I administer.)
Same question. But with 100s of playbooks, and thousands of servers. This feels like someone had a bad experience with their first 30 minutes of ansible and gave up before looking at the command reference.
No, not 30 minutes. The first time, I spent a couple of weeks just reading documentation and experimenting. That was about 8 years ago IIRC. But ever since, whenever I need something more complex than installing a package or copying a file, I feel like a 30-minute user because it doesn’t work the way I expect.
Fair enough. I honestly didn’t mean this as an insult. I have seen the same type of review from people who join teams that I’m on when they get told about ansible.
It certainly isn’t perfect. And there was a period of time about 5 years ago where a lot of change was happening at once.
Thanks for sharing your opinion
No, I can’t. I use it only occasionally, so I don’t remember everything. But many times configurations didn’t work as described in the documentation and I had to find a different way to achieve the required result. Sometimes the behaviour changed from release to release. It doesn’t feel like something I can rely on. But we’ve used it in our company for many years, so switching to another tool would be painful.
AFAIK, there’s no way to "convert" a running system into a playbook. I’d recommend looking at what your systems have in common (installed packages, timezone, etc.) and creating playbooks based on that, then working your way up from there.
Yeah, at least others are confirming what I had assumed, rather than everyone pointing me to a blindingly obvious tool that did it all for me!
Treat your current setup as your production server. Clone it first, and avoid making any changes directly to production. Any planned changes should ideally be tested on your clone or a pre-production environment first.
As far as I know, there’s no automated way to scan your existing system and generate Ansible playbooks for you. It’s a good habit to document every change you make to a system, ideally in a wiki or something similar, so you can refer back to it later. This is usually done in business environments, but it can be especially helpful if you haven’t run into a situation like this before.
For home use, I like to find roles or collections for common tasks on Ansible Galaxy or GitHub. These are often maintained by others and usually support multiple distributions. The downside is that they might not work for you unless you’re using a mainstream distro like Ubuntu or Debian.
If I were you, I’d make a list of all the services you need (Docker, NGINX, etc.), then search for roles on Ansible Galaxy sorted by download count. Use those to build a fresh server, and try to export your data from the old one to the new one. This is usually the easiest and least time-consuming approach, assuming your distributions are compatible.
Unless you’re genuinely interested in infrastructure automation, writing playbooks by hand will be tedious and time-consuming. Save that as a last resort.
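As a rough illustration of that approach (geerlingguy.docker and geerlingguy.nginx are real, widely used Galaxy roles, but check that they suit your distro before relying on them):

```yaml
# requirements.yml - pull community-maintained roles from Ansible Galaxy
roles:
  - name: geerlingguy.docker
  - name: geerlingguy.nginx
```

Install them with "ansible-galaxy install -r requirements.yml", then apply them to the fresh server:

```yaml
# fresh-server.yml - build the replacement box from the downloaded roles
- name: Build the replacement server from Galaxy roles
  hosts: new_server          # placeholder
  become: true
  roles:
    - geerlingguy.docker
    - geerlingguy.nginx
```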
Ok, good point on not writing from scratch - which is what I had been doing for the first few… which is rewarding for learning how it works, but it is slow.
Thanks
@Cyber Yeah it’s gonna be pretty manual as others have mentioned. Some areas to look at:
- Filesystem provisioning, mounts, etc.
- Packages
- Users, groups
- Time zone, locale/language, time format, etc.
- /etc/
- /root/ and /home/
- SSH settings
- Services
- Cron jobs/systemd timers

There is a bit of overlap between some of those categories. Some bits are going to see more or less use on VMs vs physical. And remember that in Ansible there are built-in modules for a lot of that functionality; see the sketch below.
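Roughly mapping those areas onto built-in (or bundled collection) modules, with placeholder values:

```yaml
# One task per checklist area; every value here is a placeholder example.
- name: Baseline reverse-engineered from the old box
  hosts: all
  become: true
  tasks:
    - name: Mounts
      ansible.posix.mount:
        src: /dev/vdb1
        path: /data
        fstype: ext4
        state: mounted

    - name: Packages
      ansible.builtin.package:
        name: [vim, htop]
        state: present

    - name: Users and groups
      ansible.builtin.user:
        name: cyber              # placeholder user
        groups: sudo
        append: true

    - name: Time zone
      community.general.timezone:
        name: Europe/London

    - name: Services
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

    - name: Cron jobs
      ansible.builtin.cron:
        name: nightly backup
        minute: "0"
        hour: "2"
        job: /usr/local/bin/backup.sh
```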
Hadn’t thought about all the locale, etc… good point, thanks
@Cyber If you have some old wiki notes on how the system was set up originally, then it might be easier to ignore the current system and translate the wiki instructions into Ansible. Still manual, but easier than reverse engineering. Another thing you can look at is bash history. Apart from backing up/cloning the system before you start, I would also get a copy of the bash history for the various users and add it to a wiki or issue too. It will be useful.
Yeah… notes… they started about 50% of the way through building the system.
Now, my notes are great, but some of these devices are ~10 years old.
But, yep, I totally agree, notes are a damn good thing to have.
Not thought about bash history though, interesting point, but I think that only goes back a short duration?
@Cyber Yeah, the bash defaults are incredibly limited, something like 1,000 entries in memory and 2,000 lines in the history file. I always bump both to something like 100,000. So the defaults can definitely bite you on an existing system, it may not have stored every command.
https://superuser.com/a/664061

@Cyber Bash also seems to default to only writing out the history entries when you cleanly exit, so I’ve definitely got gaps in my history when I killed a terminal or SSH session. When I leave work I do a quick "history -a" to append new entries that haven’t been written out yet. Some people modify their bash prompt so that it writes each entry out instantly, which I haven’t done, but I think it would be a saner default.
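If you end up rebuilding with Ansible anyway, those history tweaks can be baked in; a hypothetical task (the file name and values are just examples, and /etc/profile.d only covers login shells):

```yaml
# Make bash history roomier and flush it after every prompt. Values are examples.
- name: Roomier, always-flushed bash history
  hosts: all
  become: true
  tasks:
    - name: Drop history settings into /etc/profile.d
      ansible.builtin.copy:
        dest: /etc/profile.d/99-history.sh
        mode: "0644"
        content: |
          # keep far more history than the defaults
          export HISTSIZE=100000
          export HISTFILESIZE=100000
          # append rather than overwrite, and flush after every prompt
          shopt -s histappend
          PROMPT_COMMAND="history -a; ${PROMPT_COMMAND}"
```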