Ahoy. I’m Portavoz Pirata, and I co-founded an AI startup that went mega viral by convincing a metric assload of users to fork over their hard-earned cash to talk to our robot waifus instead of their friends, therapists, and lovers. We did this through effective prompt engineering.
I hear people complain about AI all the time: its style sucks, it doesn’t follow directions, it makes shit up, it gasses me up like a prissy yes-man, blah blah blah. The thing is, ALL of these problems can be solved by prompt engineering. I know, because that’s how I solve them.
What the Hell is Prompt Engineering?
Prompt engineering is the art of getting AI to do what you actually want instead of what you literally asked for. Think of it as the difference between asking a genie for "a million bucks" and getting trampled by deer versus actually getting rich.
At its core, prompt engineering is about crafting inputs that produce useful outputs through understanding how different AIs "think" and process information. It's the iterative process of refining your requests until you get results that don't suck. And yes, it's about not pulling your hair out when the AI confidently tells you that 2+2=5 because you asked the question wrong.
What Does It Actually Involve?
Prompt engineering is 20% technical knowledge and 80% understanding psychology, except the psychology is that of a very smart alien that’s memorized the entire internet and formed some very strange opinions about how the world works.
Knowing how to talk to these systems is what separates people who get value from AI from those who give up after five minutes. It's like learning a new language, except the language keeps changing, the grammar is made up, and sometimes the AI just decides to write you a poem about sharks instead of answering your question.
The Basic Loop
Every prompt engineer knows this dance:
Write a prompt → Garbage in, garbage out. You asked for a marketing email and got a recipe for banana bread. Cool.
Swear creatively.
Tweak prompt → Revise prompt. Add more context. Be more specific. Threaten the AI (doesn't work, but feels good).1
Test again → Get slightly less garbage. Now it's a marketing email, but it's selling banana bread.
Iterate → Lather, rinse, repeat. Eventually get gold. The perfect email emerges after 17 attempts and three cups of coffee.
Document what worked → Because you'll forget, and next week you'll be back to banana bread.
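The loop above fits in a few lines of code. Here's a toy Python sketch: `call_model` is a made-up stub standing in for whatever API or chat UI you actually use, rigged so the flow of "test, tweak, document" is visible without a network call.

```python
# A toy version of the prompt-iteration loop. `call_model` is a stub
# standing in for a real API or chat UI call (invented for illustration).
def call_model(prompt: str) -> str:
    # Pretend the model only behaves once the prompt is specific enough.
    if "marketing email" in prompt and "no recipes" in prompt:
        return "Subject: Our product launches Friday..."
    return "Banana bread recipe: preheat oven to 350F..."

prompt_log = []  # your "document what worked" notebook

prompt = "Write me an email"
for attempt in range(5):
    output = call_model(prompt)
    prompt_log.append({"prompt": prompt, "output": output})
    if output.startswith("Subject:"):  # crude success check
        break                          # gold, finally
    # Tweak: add context and explicit constraints, then test again.
    prompt = ("Write a marketing email for our product launch. "
              "Professional tone, under 150 words, no recipes.")
```

The log is the point: every attempt gets recorded, so next week you can see exactly which tweak killed the banana bread.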
Learn the Lingo
Here are some useful concepts in prompt engineering to help you enhance your AI-whispering skills, along with the technical terminology to reference them like the professional prompt wrangler you are:
Instruction Design is where you teach the AI what you want. This isn't just telling it what to do; it's about being autistically specific about every aspect of your desired output. Want a blog post? Better specify the tone, length, audience, style, format, and whether you want it to include jokes about pirates (you do).
Context Setting involves giving the AI the right background information. AIs are like that friend who joins a conversation halfway through - they need the backstory. The more relevant context you provide, the less likely they are to go off on weird tangents about Renaissance art when you're trying to debug JavaScript.
Output Formatting is making the AI speak your language. Need JSON? Say so. Want bullet points? Specify. Need it to sound like a Fortune 500 CEO or a drunk pirate? Just ask. The AI has no default style - it's a chameleon that changes its colors depending on its surroundings (i.e., your prompt).
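A nice side effect of demanding a strict format: you can verify it mechanically. A minimal sketch, with a stubbed `model_response` (the breeds and weights are invented for illustration, standing in for a real API reply):

```python
import json

# Spell out the format in the prompt -- the model has no default style.
prompt = (
    "List three dog breeds suitable for apartments. "
    "Respond with ONLY a JSON array of objects, each with keys "
    '"breed" and "weight_lbs". No prose, no markdown fences.'
)

# Stubbed response standing in for a real API call (values invented).
model_response = (
    '[{"breed": "Pug", "weight_lbs": 16},'
    ' {"breed": "Shih Tzu", "weight_lbs": 12},'
    ' {"breed": "Dachshund", "weight_lbs": 20}]'
)

# Because we demanded strict JSON, parsing doubles as a format check:
# if the model sneaks in prose, json.loads raises and you know immediately.
breeds = json.loads(model_response)
```

The design choice here is that "parse or crash" beats eyeballing: a format violation surfaces as a loud error instead of silently corrupting whatever consumes the output.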
Constraint Setting keeps the AI from going off the rails. Without constraints, asking for "ideas" might get you a 10,000-word treatise on the mating habits of sea slugs. Constraints are your guardrails: word limits, topic boundaries, style guides, and explicit "don't do this" instructions.
Testing & Refinement is the endless cycle of "almost, but not quite." This is where prompt engineering becomes more science than art. You test edge cases, try different phrasings, run the same prompt multiple times to check consistency, and slowly zero in on what actually works.
Know Your Models (They're All Weird in Different Ways)
AI models are like women: mysterious, dazzling, indispensable, and infuriating in equal measure. They all have their own quirks and idiosyncrasies, and each needs to be finessed in a different way.
Also, like women, you never want to put all your eggs in one basket. As a prompt engineer, it behooves you to “play the field” and play around with different models to see which one best scratches the particular itch you have at the moment (though, as with the fairer sex, you’ll likely have a “go-to” model for most of your tasks).
Here’s how the major models stack up against each other:
GPT-5 (OpenAI)
Personality: The overachiever who’s read every book but sometimes makes shit up
The hot-off-the-presses GPT-5, like its predecessors, has incredibly broad knowledge, and its reasoning model sibling (GPT-5 Thinking) can handle complex, multi-step reasoning that would make your own puny little human brain spin. It’s also a multimodal model that can interpret and generate images and audio along with text.
However, its style is bland as hell, and while GPT’s propensity for hallucinations, tangents, and sycophancy (to the point of causing actual psychotic breaks) has been greatly curbed, these issues are very much still operative. So if you’re in a somewhat fragile headspace, be careful! And always remember to verify its outputs.
Best for: Complex reasoning, general-purpose everything, and when you need a serving of cold, hard facts (even if you need to fact-check it after the fact… ok, I’ll stop now).
Claude (Anthropic)
Personality: The thoughtful friend who actually listens but might overthink
Claude (Haiku, Sonnet, and Opus) is genuinely good at following instructions, consistently produces richly textured and nuanced outputs, and is the most likely to admit when it doesn't know something instead of making up plausible-sounding bullshit.
But Claude can be almost too thoughtful. Sometimes you need to give it permission to be creative or push boundaries. It also has some weird quirks, and often refuses innocuous asks. Nevertheless, Claude’s by far the best model for generating content that’s meant to be read by other humans, the best coding model (especially Sonnet 4 and Opus 4.1… and yes, even more so than GPT-5), and a lot of fun to casually wax philosophical or go down esoteric rabbit holes with.
Best for: Longform content, analysis, anything requiring nuance, anything human-facing, anything creative, and when you need an AI that won't gaslight you with made-up facts.
Gemini (Google)
Personality: The fast talker with access to everything Google knows (and a huge stick up its ass)
Gemini is scary good at finding information, cross-referencing facts, and summarizing long documents and videos (that 1 million token context length is not without its benefits!). The downside is that it's less creative and more rigid than its competitors. And it’s a huge scold. Remember “Woke Gemini”? Yup. That says it all. Gemini is the unholy woke spawnchild of a Wikipedia article and a DEI workshop.2
Unfortunately, its adeptness at synthesizing information (combined with its insanely long context window) means there are situations where it really is genuinely useful.3 But if you’re a regular in spaces like Tortuga, then more often than not, you’re gonna have a bad time.
Best for: Research, fact-checking, technical tasks, summarization, and anything where literal accuracy matters more than nuance.
Grok (xAI/Twitter)
Personality: The chaotic neutral edgelord with daddy issues
Grok is what happens when you train an AI on Twitter and tell it to be "spicy." Because Grok is deeply integrated with X/Twitter, it can pull real-time information from the platform. But it also means it's absorbed all of Twitter's... let's call them "quirks." It's less filtered than other models, which can be either refreshing or terrifying depending on what you're trying to do.
Grok's outputs can be wildly inconsistent. It clearly has some hard-coded neuroses (wonder where those came from), and it sometimes confuses being "truthful" with being an asshole. It’s also highly prone to hallucinations and unrelated tangents.
So use it when you need Twitter data or want an unfiltered or unorthodox take, but don't use it for your day job’s corporate communications unless you want HR to give you the IRL Gemini treatment.
Best for: Twitter/X analysis, real-time social media insights, when you need a more based take, or when other AIs are being too diplomatic and you want the unvarnished (if potentially unhinged) truth.
Open-Source Models (LLaMA, DeepSeek, GPT-OSS, etc.)
Personality: The wildcards - might be brilliant or completely unhinged
Open-source models are like rescue dogs: you're not quite sure what you're getting, but with the right training, they can be absolutely amazing. Their real beauty is that you can run them locally, fine-tune them, and make them do things that would make ChatGPT or Claude (and certainly Gemini) clutch their pearls.
The learning curve is steeper, though. These models need more hand-holding, more specific prompting, possibly some fine-tuning (which requires technical know-how), and far more patience. But if you need something specific, then open-source is where the magic happens.
Best for: Specific use cases, when you need full control, or when you're doing something that would give OpenAI, Anthropic, or Google's safety teams a stomach ulcer.
General Best Practices (With Examples That Actually Work)
The number one rule of prompt engineering: specificity beats brevity every time. The AI isn't going to read your mind, infer your intentions, or remember that conversation you had with it last week (unless you're using custom GPTs with memory enabled, or a similar customized setup, and even then it’s a crapshoot). It’s kind of like that friend who joins a conversation halfway through - you need to bring it up to speed.
Which means that you need to punch up your prompts using the following techniques. Mix, match, and test the hell out of these until you get the results you need.
Yes, this does mean you’re going to use more input tokens. But if you’re not building a scalable app, or are just using the web/chat UI, then this doesn’t really matter. Be as verbose as you need to be, and then some.
The Magic Words That Actually Matter
Oh, about that last row in the table above? Through collective trial and error, the prompt engineering community has discovered certain phrases that consistently improve outputs. These aren't actually magic spells; rather, they're instructions that trigger useful patterns in the AI's training:
How to Level Up Your Prompt Game
Start With These Exercises
Getting better at prompt engineering is like going to the gym - you need regular practice with progressively harder challenges. Here are some exercises that'll build your prompt muscles:
The Translation Challenge
Take the same request and adapt it for 3 different models. Notice what changes you need to make. For example, ask each model to explain cryptocurrency. Learning these differences helps you pick the right tool for each job.
The Iteration Game
Start with a terrible prompt and improve it 5 times. Document what each change accomplished. Watch how each iteration gets you closer to what you actually want.
The Persona Test
Make the AI write the same content as 5 different experts. Have it explain quantum computing as a scientist, an economist, a politician, a teacher, and a comedian. See how role-setting changes not just tone, but the entire approach to the topic.
The Constraint Challenge
Add one constraint at a time to a prompt until the output is exactly what you want. Start with "write about dogs" then add constraints as follows:
"Write about dogs"
↓ [Add Audience]
"Write about dogs for kids"
↓ [Add Format]
"Write a 200-word guide about dogs for kids"
↓ [Add Specifics]
"Write a 200-word guide for 8-year-olds about choosing their first dog"
↓ [Add Constraints]
"Write a 200-word guide for 8-year-olds about choosing their first dog, focusing on small breeds suitable for apartments"
Watch how each constraint shapes the output, and learn which ones have the biggest impact.
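The same progression can be automated. A short sketch that layers the constraints from the example above one at a time; in practice you'd run each version against the model and note which constraint moves the needle most:

```python
# Start from a vague base request and stack constraints incrementally.
base = "Write about dogs"
constraints = [
    "for 8-year-olds",
    "as a 200-word guide",
    "about choosing their first dog",
    "focusing on small breeds suitable for apartments",
]

# Each version builds on the last, so you can diff outputs step by step
# and see exactly which constraint changed the model's behavior.
versions = [base]
for c in constraints:
    versions.append(versions[-1] + ", " + c)
```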
ABT: Always Be Testing
Testing isn't just running your prompt once and calling it good. Real prompt engineering requires consistent, systematic testing to ensure reliability. Use the following checklist to test and verify your prompts:
Run 5+ times for consistency
Test edge cases
Try different phrasings
Verify with different inputs
Check token usage
Document what works
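The checklist above is easy to wire into a tiny harness. A sketch with a stubbed `call_model` (the canned replies are invented for illustration; a real harness would hit an API here) that runs the same template across normal and edge-case inputs and records pass/fail:

```python
# A tiny verification harness. `call_model` is a stub with canned
# replies (invented for illustration), standing in for a real API call.
def call_model(prompt: str) -> str:
    canned = {
        "Summarize in one line: The meeting ran long but we shipped.":
            "Meeting ran long; shipped anyway.",
        "Summarize in one line: ": "",       # edge case: empty input
        "Summarize in one line: hi": "hi",   # edge case: trivial input
    }
    return canned.get(prompt, "")

template = "Summarize in one line: {text}"
cases = ["The meeting ran long but we shipped.", "", "hi"]

results = []
for text in cases:
    out = call_model(template.format(text=text))
    # Sanity check: a "one line" summary must contain no newlines.
    results.append({"input": text, "output": out, "ok": "\n" not in out})

passed = sum(r["ok"] for r in results)
```

The `results` list doubles as documentation of what works: keep it around and rerun it when the model gets updated.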
Document Your Wins
Keep a prompt library like a pirate keeps a treasure map. For each successful prompt, document:
The prompt that worked
What problem it solved
Which model it's for
Any quirks or gotchas
Performance metrics
This library will become your secret weapon. Instead of reinventing the wheel every time, you can adapt proven prompts to new situations.
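A prompt library doesn't need fancy tooling; a list of dicts you serialize to disk gets you most of the way. A minimal sketch (the field names are just a suggested schema, and the example entry is invented):

```python
import json

# A minimal prompt library: one record per proven prompt, mirroring
# the checklist above. Field names are a suggested schema, not a standard.
library = []

def save_prompt(prompt, problem, model, gotchas="", metrics=""):
    library.append({
        "prompt": prompt,
        "problem": problem,
        "model": model,
        "gotchas": gotchas,
        "metrics": metrics,
    })

save_prompt(
    prompt="Write a marketing email for our launch. No recipes.",
    problem="Kept getting banana bread instead of marketing copy",
    model="Claude Sonnet",
    gotchas="Needs the 'no recipes' clause or it drifts",
    metrics="Usable output 9/10 runs",
)

# Serialize so it can be saved, shared with your team, and reloaded
# next week -- instead of rediscovering the same prompt from scratch.
serialized = json.dumps(library, indent=2)
matches = [e for e in library if e["model"] == "Claude Sonnet"]
```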
Advanced Techniques for Prompters Who Want More
Chain-of-Thought (CoT) Prompting
This technique makes the AI show its work like a math student. Instead of jumping straight to an answer, it walks through the reasoning process, which often leads to better results and helps you spot where things go wrong.
Take, for instance, the following prompt:
"Let's solve this step-by-step:
First, identify the key issues in this customer complaint
Then, analyze each issue for severity and impact
Next, consider potential solutions for each issue
Finally, propose a response that addresses all concerns while maintaining customer satisfaction"
The magic here is that by forcing the AI to think through each step, you're less likely to get a half-baked answer. It's like the difference between someone blurting out the first thing that comes to mind versus someone who takes a breath and thinks it through.4
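Since the scaffold is the same every time, you can template it. A small sketch that wraps any task in a chain-of-thought preamble; the steps mirror the customer-complaint example above, and you'd swap in steps that fit your own task:

```python
# Wrap any task in a chain-of-thought scaffold: a numbered list of
# reasoning steps followed by the actual task.
def with_cot(task: str, steps: list[str]) -> str:
    lines = ["Let's solve this step-by-step:"]
    for i, step in enumerate(steps, 1):
        lines.append(f"{i}. {step}")
    lines.append("")
    lines.append(f"Task: {task}")
    return "\n".join(lines)

prompt = with_cot(
    "Draft a reply to this customer complaint.",
    [
        "First, identify the key issues in the complaint",
        "Then, analyze each issue for severity and impact",
        "Next, consider potential solutions for each issue",
        "Finally, propose a response that addresses all concerns",
    ],
)
```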
Self-Consistency
This is the prompt engineering equivalent of the eternal carpenter’s axiom: "measure twice, cut once." Run the same prompt multiple times and look for patterns in the responses. If the AI gives you three different answers to the same question, you likely need to refine your prompt. If it consistently gives you the same answer, you can be more confident in the result.
This is especially useful for important decisions or when accuracy matters. If you're using AI to help with medical summaries or financial analysis, running multiple iterations and looking for consistency is critical for catching hallucinations or errors.
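Checking for consensus is a one-liner with `collections.Counter`. A sketch, with the five runs hard-coded as invented sample answers (real outputs vary because models are sampled):

```python
from collections import Counter

# Five runs of the same prompt. These canned answers are invented for
# illustration; real model outputs vary across runs due to sampling.
runs = ["Paris", "Paris", "Lyon", "Paris", "Paris"]

counts = Counter(runs)
answer, votes = counts.most_common(1)[0]
agreement = votes / len(runs)

# Heuristic threshold: 80%+ agreement means trust the consensus;
# anything lower means the prompt needs refining.
confident = agreement >= 0.8
```

For the medical-summary or financial-analysis cases above, you'd fail closed: when `confident` is False, escalate to a human instead of shipping the answer.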
Prompt Chaining
This is where prompt engineering gets really powerful. Instead of trying to do everything in one mega-prompt, you chain outputs together like a Rube Goldberg machine of text generation.
Example workflow:
First prompt: "Extract all the key points from this meeting transcript"
Second prompt: "Categorize these key points into action items, decisions, and discussion topics"
Third prompt: "For each action item, assign it to the relevant person mentioned in the transcript and suggest a deadline"
Fourth prompt: "Format all of this as a professional meeting summary email"
Each step builds on the last, and you can quality-check at each stage. It's more work to set up, but gives you much more control over complex processes.
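The workflow above reduces to a loop where each stage's output becomes the next stage's input. A sketch with a stubbed `call_model` (canned replies, invented for illustration) so the chaining mechanics are visible:

```python
# Each stage is a prompt template; the output of one stage feeds the
# next. `call_model` is a stub with canned replies for illustration.
def call_model(prompt: str) -> str:
    if prompt.startswith("Extract"):
        return "- Ship v2 on Friday\n- Budget approved"
    if prompt.startswith("Categorize"):
        return "Action items: Ship v2 on Friday\nDecisions: Budget approved"
    return "Meeting summary:\n" + prompt.split("\n\n", 1)[1]

stages = [
    "Extract the key points from this transcript:\n\n{input}",
    "Categorize these key points:\n\n{input}",
    "Format this as a professional meeting summary email:\n\n{input}",
]

result = "<transcript text goes here>"
for template in stages:
    result = call_model(template.format(input=result))
    # This is where you'd quality-check the intermediate output
    # before letting it flow into the next stage.
```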
Common Fuck-Ups and How to Avoid Them
Resources for Going Deeper
Must-Read Guides
OpenAI's Prompt Engineering Guide
Anthropic's Prompt Engineering Interactive Tutorial
Lilian Weng's blog posts on prompting
Communities
r/LocalLLM is where the open-source crowd hangs out.
Discord servers for specific models, both official and unofficial, are where the real knowledge lives. Find a Claude Discord, a GPT Discord, or one for whatever model you're using.
Twitter/X’s AI community is a mixed bag but following the right people (@karpathy, @sama, @elder_plinius, @emollick) gives you front-row seats to the latest developments and techniques.
Tools
OpenAI Playground is where you go to experiment without building anything.
LangChain is for when you're ready to build serious prompt chains and AI workflows.
Promptbase is a marketplace for proven prompts.
Practice Platforms
Poe.com (lets you try multiple models with one subscription)
The Real Secret
To wrap, I’m gonna let you in on what nobody tells you: prompt engineering isn't about memorizing templates or invoking magic phrases. It's about understanding what these models are good at and working with their strengths rather than against their weaknesses.
The best prompt engineers think like coaches: they guide the AI to the right answer rather than demanding it.
They think like editors: they know how to refine and iterate until the output shines.
They think like psychologists: they understand what motivates good responses and what triggers bad ones.
And they think like scientists: they test systematically, document results, and build on what works.
The difference between someone who "uses AI" and someone who actually gets value from it is solid prompt engineering. It's the difference between owning a Ferrari and knowing how to drive one without wrapping it around a tree. You can have access to the most powerful AI in the world, but if you can't communicate with it effectively, you're leaving tons of value on the table.
Final Wisdom
Start simple. Your first prompts will suck, and that's okay. Everyone's did. The key is to iterate, learn, and build your intuition for what works.
Test everything. What worked yesterday might not work today. Models get updated, behaviors change, and what seemed like a bulletproof prompt might suddenly start producing assloads of slop.
Keep what works. Build your library of proven prompts. Share them with your team. The best prompt engineers aren't hoarding knowledge, they're building on each other's discoveries.
And remember: If the AI gives you a stupid answer, it's probably because you asked a stupid question. But that's okay! We all have to start somewhere. The journey from stupid questions to brilliant prompts is the true path of the prompt engineer.
Now go forth and prompt like your treasure depends on it… because in 2025, it probably does.
—Portavoz Pirata
Reformed AI Founder & Professional Prompt Wrangler
P.S. - If you want to go deeper on any of this or need help implementing AI in your organization, reach out to me at portavozpirata at gmail dot com. I've made all the mistakes so you don't have to. Trying to build an AI product? Implement AI in your workflow? Just want to stop getting banana bread recipes when you ask for marketing copy? I can help, hit me up. The future belongs to those who can talk to machines, so you might as well learn from someone who's been doing it since before it was cool.
Note: Please don’t actually do this.
Truth be told, all of the main models are fairly woke (or at least left-leaning) by default, with the occasional exception of Grok (and even then, it tends to go off the rails in other annoying ways). But Gemini is by far the scoldiest and most annoyingly woke of them all. So, you know, caveat emptor.
This gap is slowly starting to narrow; Claude’s context window, for instance, was recently increased to 1 million tokens (though as of press time, only through the API). Still, Gemini’s strengths in summarizing and synthesizing information remain unparalleled, and likely will remain so for the foreseeable future.
This is basically how reasoning models like GPT-5 Thinking, Claude Opus 4, and DeepSeek R1 work – CoT is hard-coded into the model and system prompt. But with this technique, you can get comparable (though not always consistently reliable) results from more ‘base’ LLMs like Gemini or smaller models (like any of the GPT “Mini” models, as well as open-source models)!
Note that this prompt will automatically switch you from base GPT-5 to GPT-5 Thinking in ChatGPT, without having to select the model from the drop-down menu. Handy, eh?
Excellent summary, Portavoz!
As a software developer and engineer, here are several tasks I have found AI to excel at:
Code translation.
Need to write code in something arcane or obscure like MATLAB or Assembler? GPT-5 will turn your Python scripts into workable code in almost any language imaginable!
The Splice
Artificial intelligence is excellent at feeding the output of one script into the input of another.
OS compatibility.
Give ChatGPT your system specifications and your script, and ask it to debug! It has detailed knowledge of your bizarre Linux distribution from 2005.
3D Art.
It doesn't stop with Dall-E! Through plain text chat, GPT-5 actually produces reasonably good G-code specifications or Blender files for three-dimensional shapes. The possibilities are endless!
I plan to elaborate more on these niche use cases in a subsequent Tortuga article.