Phillip Rhodes' Weblog
This blog got ransomwared
I'm generally a big believer in "learning in public" and emphasizing transparency, for numerous reasons. I won't detail those reasons here; they might make a good blog post in their own right. For now, I'm just going to share something that happened to me in the process of setting up this blog, where a series of mistakes on my part led to the initial install getting ransomwared!
Before getting into the nitty-gritty, let me just say that the whole ransomware thing was a nothing-burger. The only content on the blog was a "Hello World" post I made basically to test that the Roller install was up and running. Let me also say up front that the attack succeeded due solely to my own carelessness, and is in no way an indictment of Roller, Tomcat, Postgresql, or any of the other parts of my stack.
Anyway, the story. So I finished writing all the Ansible roles, playbooks, and bash scripts that I needed to deploy this server on Sunday evening. I ran the script, saw the blog server come up successfully, and wrote the aforementioned "Hello World" post. It was late and I was tired, so I opened a text file in the project directory, wrote myself some notes on "punch list" items to finish later, and went off to do other things.

The next morning I happened to mention the new blog in a comment on a Hacker News thread (one of those "What are you working on?" threads that pop up from time to time). As a sanity check, I clicked the link I had posted... and it didn't work. Huh?
After spinning for a while, I finally got an error message saying "database rollerdb does not exist." I was briefly bewildered since I knew the db was working fine about 12 hours earlier and I knew I had not touched anything since then. A creeping suspicion started to crawl into my consciousness.
I ssh'd into the server, verified that Postgresql was still running, and then fired up psql and listed the databases. And lo and behold, there was no rollerdb database present. Instead there was a database named "readme_to_recover". That nagging suspicion started to become a blaring alarm. But I held out some small hope that this was just a Postgresql failure and that creating a database with that name was part of some error handling routine.
Plowing on, I connected to that database, listed tables, and saw one named "readme". Doing a quick `select * from readme`, I was greeted with a message approximately like this:
Your content has been encrypted. To recover, send 0.13 BTC to the following address ...
I'd been ransomwared. This was the first time it had ever happened to me, so I was in a bit of disbelief for a few minutes. And the server had been up for less than 24 hours!
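For anyone who wants to retrace the inspection steps, the psql session went roughly like this. This is reconstructed from memory rather than a verbatim transcript, but the commands are the standard ones:

    # on the server, open a psql session as the postgres OS user
    sudo -u postgres psql

    -- inside psql: list databases, then poke at the one the attacker left behind
    \l
    \c readme_to_recover
    \dt
    SELECT * FROM readme;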
My thoughts quickly turned to "how did they get in?" and I started investigating the state of my server. Now, I had a hunch pretty early on, and once I'd confirmed the ransomware situation, I basically jumped straight to trying to confirm or deny that hunch. And it went something like this:
"I bet I had Postgreql bound to all IP interfaces on this host by accident, and probably had the postgres user set for "trust" authentication".
Of course that still wouldn't have explained why the firewall didn't block access on port 5432, but it seemed like a good place to start. So I jumped over to the Postgresql config dir, checked my postgresql.conf file, and found this:
listen_addresses = '*'
At that point the rest was pretty much a foregone conclusion. I moved on to checking the pg_hba.conf and found this line:
host all postgres 0.0.0.0/0 trust
That line tells Postgresql to accept TCP connections as the postgres superuser, to any database, from any IPv4 address, with no authentication at all. OK, so not much question remaining now, aside from "how could I be so stupid?" (but we'll get to that in a minute) and "what about the firewall?"
I did a quick firewall-cmd --list-ports and was informed that the firewalld service was not running. And there ya go. I tried a "telnet philliprhodes.name 5432" from my laptop and was greeted with a connection banner. Anybody in the entire world could connect to my database, as the admin user, with no authentication. Not my proudest moment, but I kinda knew why this happened, and more to the point, I knew (at least some of) the steps I needed to take to make sure it didn't happen again.
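Incidentally, if you want to check one of your own hosts for this kind of exposure, a couple of commands from any machine outside its network will tell you quickly (the hostname here is obviously mine; substitute your own):

    # is the default Postgresql port reachable from the outside at all?
    nmap -Pn -p 5432 philliprhodes.name

    # worse: can a total stranger connect as the admin user with no password?
    psql -h philliprhodes.name -U postgres -c 'SELECT current_user;'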
First things first, I edited postgresql.conf so that Postgresql only binds to 127.0.0.1, and changed pg_hba.conf to only accept TCP connections for the "postgres" user from 127.0.0.1 as well. I didn't immediately change the authentication method from "trust" to something else, only because that would require changes on the application side as well, and I just wanted to get things back up and running in a somewhat more secure fashion.
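For the record, the tightened-up versions of those two lines look roughly like this. I'm paraphrasing my own config from memory here, but the shape of the change is this: loopback only, and still "trust", but only for loopback:

    # postgresql.conf -- only listen on the loopback interface
    listen_addresses = '127.0.0.1'

    # pg_hba.conf -- only accept TCP connections for postgres from localhost
    host    all    postgres    127.0.0.1/32    trust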
I redeployed everything and then did a server reboot (because I wanted to see if the firewalld service was starting correctly at bootup). After a reboot firewalld was indeed running as expected and port 5432 was no longer open.
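The post-reboot check is worth spelling out, because "the firewall is running right now" and "the firewall will come back after a reboot" are two different claims. Something like:

    # should print 'running' and 'enabled' respectively
    firewall-cmd --state
    systemctl is-enabled firewalld

    # 5432 should not appear in the list of explicitly opened ports
    firewall-cmd --list-ports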
So... what exactly happened here? As somebody who has been doing this stuff for 25+ years and who thinks of himself as slightly competent, how did I manage to make such an egregious series of mistakes? And what am I doing in response? Glad you asked.
For starters, I basically know how I got there. I have a habit of sitting in random coffee shops / cafes / etc. and ssh'ing into my server(s) to work on stuff. And at some point, I wanted to do something on one of my servers using psql directly over IP. So I created a postgresql.conf file with that bind configuration, and a pg_hba.conf with that authentication configuration, and deployed those files. And I made everything so generically open so that I wouldn't have to worry about knowing which IP address I was on, or dealing with it changing, etc. That's just laziness on my part, sadly.
That all almost certainly got reverted on the server later (I say that because none of my other servers have had similar problems, but we'll come back to that later as well). Anyway, those files apparently made it into a directory of sample files I keep around on my laptop to crib from when setting up Postgresql, and I copied them blindly when creating my Ansible role for this. The firewall thing? Something similar, I'm sure. It was correctly set to start on boot, so I'm pretty sure I shelled in one day to do something, stopped the service for some temporary manual fiddling, and then just plain forgot to restart it. And the server had not been rebooted since then.
So in the end, I somehow managed to make three ridiculous mistakes, any two of which on their own would probably NOT have resulted in my server getting ransomwared; it took all three lining up.
So what now? Well, the new server has been running fine for 24+ hours now, so I'm pretty sure whatever ransomware script kiddie's port scanner found my vulnerable server is no longer messing with me. But I don't want this to happen again, so what to do? Well, there are probably a million things, and I probably don't know all of them, but here are some things I have already done, or plan to do:
- Quit doing so much "manual fiddling" on servers in the first place. That's part of the reason I'm going down the Ansible path to begin with. I want all my servers to be completely deployable through "known good" (more or less) automation processes. The goal is to get to where any server I maintain can be completely rebuilt by re-provisioning the underlying VM and then running the associated script that triggers Ansible.
- Set up automated backups where the backup data is pulled by another server that is well hardened and is never even mentioned on the server being backed up.
- Add an nmap port scan to the end of my bash scripts that do server configuration. So every. single. time. I fire the script against a server, the last thing that happens is nmap scanning for open ports and dumping the results right in my face. I've already implemented this, and if I'd had it earlier I would have noticed the problem immediately. (There's a sketch of what this looks like after this list.)
- Quit using the postgres user for applications. This is a really bad habit of mine and one I need to break. Likewise, I'm going to quit relying on "trust" authentication and start always mandating a password (even for connections from localhost). A rough sketch of this change is also below.
- Set up routine automated port-scanning for all of my servers. Like I said, none of the others have ever had something like this happen, but I can't assume that everything is perfect on those either. So at some point I'm setting up a scheduled job that runs at least a port scan, if not some deeper automated security checks, against every server I maintain.
- Until the routine scanning in the previous item is in place, I'm going to do a manual audit of all of my other servers for any obvious fuck-ups like this, and fix any such problems I find.
- Start running Postgresql on a non-standard port. I know, I know... "security through obscurity". And yet... I strongly suspect I got tagged by an automated scanner just crawling around looking for open ports like 5432 and 3306 and suchlike. There's a decent chance that if I'd been running Postgresql on 5999 or 7379 or something, this wouldn't have happened.
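A couple of those items are concrete enough to sketch out. First, the post-deploy port scan: the tail end of my provisioning script now does something along these lines (the variable names here are illustrative, not lifted verbatim from my script):

    #!/usr/bin/env bash
    # Final step of provisioning: scan the host we just configured and put
    # the list of open ports right in front of whoever ran the script.
    set -euo pipefail

    TARGET_HOST="$1"   # e.g. philliprhodes.name

    echo "=== open TCP ports on ${TARGET_HOST} ==="
    # -Pn: skip host discovery, -sT: plain TCP connect scan (no root needed),
    # -p-: all 65535 ports, so a Postgresql moved to an odd port still shows up
    nmap -Pn -sT -p- "${TARGET_HOST}"
    echo "=== review that list before walking away ==="

And second, the shape of the "quit using the postgres superuser, quit using trust" change: a dedicated role with a password, plus a pg_hba.conf line that demands one even from localhost. The role name and the scram method here are my working assumptions, not settled decisions:

    -- run once as the postgres superuser: a dedicated, non-superuser role for the blog
    CREATE ROLE roller_app WITH LOGIN PASSWORD 'use-a-real-generated-password-here';
    GRANT ALL PRIVILEGES ON DATABASE rollerdb TO roller_app;

    # pg_hba.conf -- password (scram) auth even for loopback connections
    host    rollerdb    roller_app    127.0.0.1/32    scram-sha-256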
Anyway, that's the story of how my shiny new blog got hit with a ransomware attack. Sadly (for the attackers), I won't be paying the ransom. And hopefully the lessons I learned from all of this mean I won't have to in the future either.
Posted at 06:37PM Apr 29, 2025 by Phillip Rhodes in Technology
Hello, Blogosphere!
If you're reading this, congratulations! You've made your way to my new weblog (powered by Apache Roller!). More to come...
Posted at 04:26PM Apr 28, 2025 by Phillip Rhodes in Technology