Ahoy. I’m Portavoz Pirata, and I co-founded an AI startup that went mega viral by convincing a metric assload of users to fork over their hard-earned cash to talk to our robot waifus instead of their friends, therapists, and lovers. We did this through effective prompt engineering.
I hear people complain about AI all the time: its style sucks, it doesn’t follow directions, it makes shit up, it gasses me up like a prissy yes-man, blah blah blah. The thing is, ALL of these problems can be solved by prompt engineering. I know, because that’s how I solve them.
What the Hell is Prompt Engineering?
Prompt engineering is the art of getting AI to do what you actually want instead of what you literally asked for. Think of it as the difference between asking a genie for "a million bucks" and getting trampled by deer versus actually getting rich.
At its core, prompt engineering is about crafting inputs that produce useful outputs through understanding how different AIs "think" and process information. It's the iterative process of refining your requests until you get results that don't suck. And yes, it's about not pulling your hair out when the AI confidently tells you that 2+2=5 because you asked the question wrong.
What Does It Actually Involve?
Prompt engineering is 20% technical knowledge and 80% understanding psychology, except the psychology is that of a very smart alien that’s memorized the entire internet and formed some very strange opinions about how the world works.
Knowing how to talk to these systems is what separates people who get value from AI from those who give up after five minutes. It's like learning a new language, except the language keeps changing, the grammar is made up, and sometimes the AI just decides to write you a poem about sharks instead of answering your question.
The Basic Loop
Every prompt engineer knows this dance:
Write a prompt → Garbage in, garbage out. You asked for a marketing email and got a recipe for banana bread. Cool.
Swear creatively.
Tweak prompt → Revise prompt. Add more context. Be more specific. Threaten the AI (doesn't work, but feels good).1
Test again → Get slightly less garbage. Now it's a marketing email, but it's selling banana bread.
Iterate → Lather, rinse, repeat. Eventually get gold. The perfect email emerges after 17 attempts and three cups of coffee.
Document what worked → Because you'll forget, and next week you'll be back to banana bread.
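The loop above fits in a few lines of code. Here's a toy Python sketch: `call_model` is a made-up stub standing in for whatever API or chat UI you actually use, rigged so the flow of "test, tweak, document" is visible without a network call.

```python
# A toy version of the prompt-iteration loop. `call_model` is a stub
# standing in for a real API or chat UI call (invented for illustration).
def call_model(prompt: str) -> str:
    # Pretend the model only behaves once the prompt is specific enough.
    if "marketing email" in prompt and "no recipes" in prompt:
        return "Subject: Our product launches Friday..."
    return "Banana bread recipe: preheat oven to 350F..."

prompt_log = []  # your "document what worked" notebook

prompt = "Write me an email"
for attempt in range(5):
    output = call_model(prompt)
    prompt_log.append({"prompt": prompt, "output": output})
    if output.startswith("Subject:"):  # crude success check
        break                          # gold, finally
    # Tweak: add context and explicit constraints, then test again.
    prompt = ("Write a marketing email for our product launch. "
              "Professional tone, under 150 words, no recipes.")
```

The log is the point: every attempt gets recorded, so next week you can see exactly which tweak killed the banana bread.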
Learn the Lingo
Here are some useful concepts in prompt engineering to help you enhance your AI-whispering skills, along with the technical terminology to reference them like the professional prompt wrangler you are:
Instruction Design is where you teach the AI what you want. This isn't just telling it what to do; it's about being autistically specific about every aspect of your desired output. Want a blog post? Better specify the tone, length, audience, style, format, and whether you want it to include jokes about pirates (you do).
Context Setting involves giving the AI the right background information. AIs are like that friend who joins a conversation halfway through - they need the backstory. The more relevant context you provide, the less likely they are to go off on weird tangents about Renaissance art when you're trying to debug JavaScript.
Output Formatting is making the AI speak your language. Need JSON? Say so. Want bullet points? Specify. Need it to sound like a Fortune 500 CEO or a drunk pirate? Just ask. The AI has no default style - it's a chameleon that changes its colors depending on its surroundings (i.e., your prompt).
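A nice side effect of demanding a strict format: you can verify it mechanically. A minimal sketch, with a stubbed `model_response` (the breeds and weights are invented for illustration, standing in for a real API reply):

```python
import json

# Spell out the format in the prompt -- the model has no default style.
prompt = (
    "List three dog breeds suitable for apartments. "
    "Respond with ONLY a JSON array of objects, each with keys "
    '"breed" and "weight_lbs". No prose, no markdown fences.'
)

# Stubbed response standing in for a real API call (values invented).
model_response = (
    '[{"breed": "Pug", "weight_lbs": 16},'
    ' {"breed": "Shih Tzu", "weight_lbs": 12},'
    ' {"breed": "Dachshund", "weight_lbs": 20}]'
)

# Because we demanded strict JSON, parsing doubles as a format check:
# if the model sneaks in prose, json.loads raises and you know immediately.
breeds = json.loads(model_response)
```

The design choice here is that "parse or crash" beats eyeballing: a format violation surfaces as a loud error instead of silently corrupting whatever consumes the output.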
Constraint Setting keeps the AI from going off the rails. Without constraints, asking for "ideas" might get you a 10,000-word treatise on the mating habits of sea slugs. Constraints are your guardrails: word limits, topic boundaries, style guides, and explicit "don't do this" instructions.
Testing & Refinement is the endless cycle of "almost, but not quite." This is where prompt engineering becomes more science than art. You test edge cases, try different phrasings, run the same prompt multiple times to check consistency, and slowly zero in on what actually works.
Know Your Models (They're All Weird in Different Ways)
AI models are like women: mysterious, dazzling, indispensable, and infuriating in equal measure. They all have their own quirks and idiosyncrasies, and each needs to be finessed in a different way.
Also, like women, you never want to put all your eggs in one basket. As a prompt engineer, it behooves you to “play the field” and play around with different models to see which one best scratches the particular itch you have at the moment (though, as with the fairer sex, you’ll likely have a “go-to” model for most of your tasks).
Here’s how the major models stack up against each other:
GPT-5 (OpenAI)
Personality: The overachiever who’s read every book but sometimes makes shit up
The hot-off-the-presses GPT-5, like its predecessors, has incredibly broad knowledge, and its reasoning model sibling (GPT-5 Thinking) can handle complex, multi-step reasoning that would make your own puny little human brain spin. It’s also a multimodal model that can interpret and generate images and audio along with text.
However, its style is bland as hell, and while GPT’s propensity for hallucinations, tangents, and sycophancy (to the point of causing actual psychotic breaks) has been greatly curbed, these issues are very much still operative. So if you’re in a somewhat fragile headspace, be careful! And always remember to verify its outputs.
Best for: Complex reasoning, general-purpose everything, and when you need a serving of cold, hard facts (even if you need to fact-check it after the fact… ok, I’ll stop now).
Claude (Anthropic)
Personality: The thoughtful friend who actually listens but might overthink
Claude (Haiku, Sonnet, and Opus) is genuinely good at following instructions, consistently produces richly textured and nuanced outputs, and is the most likely to admit when it doesn't know something instead of making up plausible-sounding bullshit.
But Claude can be almost too thoughtful. Sometimes you need to give it permission to be creative or push boundaries. It also has some weird quirks, and often refuses innocuous asks. Nevertheless, Claude’s by far the best model for generating content that’s meant to be read by other humans, the best coding model (especially Sonnet 4 and Opus 4.1… and yes, even more so than GPT-5), and a lot of fun to casually wax philosophical or go down esoteric rabbit holes with.
Best for: Longform content, analysis, anything requiring nuance, anything human-facing, anything creative, and when you need an AI that won't gaslight you with made-up facts.
Gemini (Google)
Personality: The fast talker with access to everything Google knows (and a huge stick up its ass)
Gemini is scary good at finding information, cross-referencing facts, and summarizing long documents and videos (that 1 million token context length is not without its benefits!). The downside is that it's less creative and more rigid than its competitors. And it’s a huge scold. Remember “Woke Gemini”? Yup. That says it all. Gemini is the unholy woke spawnchild of a Wikipedia article and a DEI workshop.2
Unfortunately, its adeptness at synthesizing information (combined with its insanely long context window) means there are situations where it really is genuinely useful.3 But if you’re a regular in spaces like Tortuga, then more often than not, you’re gonna have a bad time.
Best for: Research, fact-checking, technical tasks, summarization, and anything where literal accuracy matters more than nuance.
Grok (xAI/Twitter)
Personality: The chaotic neutral edgelord with daddy issues
Grok is what happens when you train an AI on Twitter and tell it to be "spicy." Because Grok is deeply integrated with X/Twitter, it can pull real-time information from the platform. But it also means it's absorbed all of Twitter's... let's call them "quirks." It's less filtered than other models, which can be either refreshing or terrifying depending on what you're trying to do.
Grok's outputs can be wildly inconsistent. It clearly has some hard-coded neuroses (wonder where those came from), and it sometimes confuses being "truthful" with being an asshole. It’s also highly prone to hallucinations and unrelated tangents.
So use it when you need Twitter data or want an unfiltered or unorthodox take, but don't use it for your day job’s corporate communications unless you want HR to give you the IRL Gemini treatment.
Best for: Twitter/X analysis, real-time social media insights, when you need a more based take, or when other AIs are being too diplomatic and you want the unvarnished (if potentially unhinged) truth.
Open-Source Models (LLaMA, DeepSeek, GPT-OSS, etc.)
Personality: The wildcards - might be brilliant or completely unhinged
Open-source models are like rescue dogs: you're not quite sure what you're getting, but with the right training, they can be absolutely amazing. Their real beauty is that you can run them locally, fine-tune them, and make them do things that would make ChatGPT or Claude (and certainly Gemini) clutch their pearls.
The learning curve is steeper, though. These models need more hand-holding, more specific prompting, possibly some fine-tuning (which requires technical know-how), and far more patience. But if you need something specific, then open-source is where the magic happens.
Best for: Specific use cases, when you need full control, or when you're doing something that would give OpenAI, Anthropic, or Google's safety teams a stomach ulcer.
General Best Practices (With Examples That Actually Work)
The number one rule of prompt engineering: specificity beats brevity every time. The AI isn't going to read your mind, infer your intentions, or remember that conversation you had with it last week (unless you're using custom GPTs with memory enabled, or a similar customized setup, and even then it’s a crapshoot). It’s kind of like that friend who joins a conversation halfway through - you need to bring it up to speed.
Which means that you need to punch up your prompts using the following techniques. Mix, match, and test the hell out of these until you get the results you need.
Yes, this does mean you’re going to use more input tokens. But if you’re not building a scalable app, or are just using the web/chat UI, then this doesn’t really matter. Be as verbose as you need to be, and then some.
The Magic Words That Actually Matter
Oh, about that last row in the table above? Through collective trial and error, the prompt engineering community has discovered certain phrases that consistently improve outputs. These aren't actually magic spells; rather, they're instructions that trigger useful patterns in the AI's training:
How to Level Up Your Prompt Game
Start With These Exercises
Getting better at prompt engineering is like going to the gym - you need regular practice with progressively harder challenges. Here are some exercises that'll build your prompt muscles:
The Translation Challenge
Take the same request and adapt it for 3 different models. Notice what changes you need to make. For example, ask each model to explain cryptocurrency. Learning these differences helps you pick the right tool for each job.
The Iteration Game
Start with a terrible prompt and improve it 5 times. Document what each change accomplished. Watch how each iteration gets you closer to what you actually want.
The Persona Test
Make the AI write the same content as 5 different experts. Have it explain quantum computing as a scientist, an economist, a politician, a teacher, and a comedian. See how role-setting changes not just tone, but the entire approach to the topic.
The Constraint Challenge
Add one constraint at a time to a prompt until the output is exactly what you want. Start with "write about dogs" then add constraints as follows:
"Write about dogs"
↓ [Add Audience]
"Write about dogs for kids"
↓ [Add Format]
"Write a 200-word guide about dogs for kids"
↓ [Add Specifics]
"Write a 200-word guide for 8-year-olds about choosing their first dog"
↓ [Add Constraints]
"Write a 200-word guide for 8-year-olds about choosing their first dog, focusing on small breeds suitable for apartments"
Watch how each constraint shapes the output, and learn which ones have the biggest impact.
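The same progression can be automated. A short sketch that layers the constraints from the example above one at a time; in practice you'd run each version against the model and note which constraint moves the needle most:

```python
# Start from a vague base request and stack constraints incrementally.
base = "Write about dogs"
constraints = [
    "for 8-year-olds",
    "as a 200-word guide",
    "about choosing their first dog",
    "focusing on small breeds suitable for apartments",
]

# Each version builds on the last, so you can diff outputs step by step
# and see exactly which constraint changed the model's behavior.
versions = [base]
for c in constraints:
    versions.append(versions[-1] + ", " + c)
```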
ABT: Always Be Testing
Testing isn't just running your prompt once and calling it good. Real prompt engineering requires consistent, systematic testing to ensure reliability. Use the following checklist to test and verify your prompts:
Run 5+ times for consistency
Test edge cases
Try different phrasings
Verify with different inputs
Check token usage
Document what works
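The checklist above is easy to wire into a tiny harness. A sketch with a stubbed `call_model` (the canned replies are invented for illustration; a real harness would hit an API here) that runs the same template across normal and edge-case inputs and records pass/fail:

```python
# A tiny verification harness. `call_model` is a stub with canned
# replies (invented for illustration), standing in for a real API call.
def call_model(prompt: str) -> str:
    canned = {
        "Summarize in one line: The meeting ran long but we shipped.":
            "Meeting ran long; shipped anyway.",
        "Summarize in one line: ": "",       # edge case: empty input
        "Summarize in one line: hi": "hi",   # edge case: trivial input
    }
    return canned.get(prompt, "")

template = "Summarize in one line: {text}"
cases = ["The meeting ran long but we shipped.", "", "hi"]

results = []
for text in cases:
    out = call_model(template.format(text=text))
    # Sanity check: a "one line" summary must contain no newlines.
    results.append({"input": text, "output": out, "ok": "\n" not in out})

passed = sum(r["ok"] for r in results)
```

The `results` list doubles as documentation of what works: keep it around and rerun it when the model gets updated.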
Document Your Wins
Keep a prompt library like a pirate keeps a treasure map. For each successful prompt, document:
The prompt that worked
What problem it solved
Which model it's for
Any quirks or gotchas
Performance metrics
This library will become your secret weapon. Instead of reinventing the wheel every time, you can adapt proven prompts to new situations.
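A prompt library doesn't need fancy tooling; a list of dicts you serialize to disk gets you most of the way. A minimal sketch (the field names are just a suggested schema, and the example entry is invented):

```python
import json

# A minimal prompt library: one record per proven prompt, mirroring
# the checklist above. Field names are a suggested schema, not a standard.
library = []

def save_prompt(prompt, problem, model, gotchas="", metrics=""):
    library.append({
        "prompt": prompt,
        "problem": problem,
        "model": model,
        "gotchas": gotchas,
        "metrics": metrics,
    })

save_prompt(
    prompt="Write a marketing email for our launch. No recipes.",
    problem="Kept getting banana bread instead of marketing copy",
    model="Claude Sonnet",
    gotchas="Needs the 'no recipes' clause or it drifts",
    metrics="Usable output 9/10 runs",
)

# Serialize so it can be saved, shared with your team, and reloaded
# next week -- instead of rediscovering the same prompt from scratch.
serialized = json.dumps(library, indent=2)
matches = [e for e in library if e["model"] == "Claude Sonnet"]
```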
Advanced Techniques for Prompters Who Want More
Chain-of-Thought (CoT) Prompting
This technique makes the AI show its work like a math student. Instead of jumping straight to an answer, it walks through the reasoning process, which often leads to better results and helps you spot where things go wrong.
Take, for instance, the following prompt:
"Let's solve this step-by-step:
First, identify the key issues in this customer complaint
Then, analyze each issue for severity and impact
Next, consider potential solutions for each issue
Finally, propose a response that addresses all concerns while maintaining customer satisfaction"
The magic here is that by forcing the AI to think through each step, you're less likely to get a half-baked answer. It's like the difference between someone blurting out the first thing that comes to mind versus someone who takes a breath and thinks it through.4
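Since the scaffold is the same every time, you can template it. A small sketch that wraps any task in a chain-of-thought preamble; the steps mirror the customer-complaint example above, and you'd swap in steps that fit your own task:

```python
# Wrap any task in a chain-of-thought scaffold: a numbered list of
# reasoning steps followed by the actual task.
def with_cot(task: str, steps: list[str]) -> str:
    lines = ["Let's solve this step-by-step:"]
    for i, step in enumerate(steps, 1):
        lines.append(f"{i}. {step}")
    lines.append("")
    lines.append(f"Task: {task}")
    return "\n".join(lines)

prompt = with_cot(
    "Draft a reply to this customer complaint.",
    [
        "First, identify the key issues in the complaint",
        "Then, analyze each issue for severity and impact",
        "Next, consider potential solutions for each issue",
        "Finally, propose a response that addresses all concerns",
    ],
)
```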
Self-Consistency
This is the prompt engineering equivalent of the eternal carpenter’s axiom: "measure twice, cut once." Run the same prompt multiple times and look for patterns in the responses. If the AI gives you three different answers to the same question, you likely need to refine your prompt. If it consistently gives you the same answer, you can be more confident in the result.
This is especially useful for important decisions or when accuracy matters. If you're using AI to help with medical summaries or financial analysis, running multiple iterations and looking for consistency is critical for catching hallucinations or errors.
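Checking for consensus is a one-liner with `collections.Counter`. A sketch, with the five runs hard-coded as invented sample answers (real outputs vary because models are sampled):

```python
from collections import Counter

# Five runs of the same prompt. These canned answers are invented for
# illustration; real model outputs vary across runs due to sampling.
runs = ["Paris", "Paris", "Lyon", "Paris", "Paris"]

counts = Counter(runs)
answer, votes = counts.most_common(1)[0]
agreement = votes / len(runs)

# Heuristic threshold: 80%+ agreement means trust the consensus;
# anything lower means the prompt needs refining.
confident = agreement >= 0.8
```

For the medical-summary or financial-analysis cases above, you'd fail closed: when `confident` is False, escalate to a human instead of shipping the answer.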
Prompt Chaining
This is where prompt engineering gets really powerful. Instead of trying to do everything in one mega-prompt, you chain outputs together like a Rube Goldberg machine of text generation.
Example workflow:
First prompt: "Extract all the key points from this meeting transcript"
Second prompt: "Categorize these key points into action items, decisions, and discussion topics"
Third prompt: "For each action item, assign it to the relevant person mentioned in the transcript and suggest a deadline"
Fourth prompt: "Format all of this as a professional meeting summary email"
Each step builds on the last, and you can quality-check at each stage. It's more work to set up, but gives you much more control over complex processes.
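The workflow above reduces to a loop where each stage's output becomes the next stage's input. A sketch with a stubbed `call_model` (canned replies, invented for illustration) so the chaining mechanics are visible:

```python
# Each stage is a prompt template; the output of one stage feeds the
# next. `call_model` is a stub with canned replies for illustration.
def call_model(prompt: str) -> str:
    if prompt.startswith("Extract"):
        return "- Ship v2 on Friday\n- Budget approved"
    if prompt.startswith("Categorize"):
        return "Action items: Ship v2 on Friday\nDecisions: Budget approved"
    return "Meeting summary:\n" + prompt.split("\n\n", 1)[1]

stages = [
    "Extract the key points from this transcript:\n\n{input}",
    "Categorize these key points:\n\n{input}",
    "Format this as a professional meeting summary email:\n\n{input}",
]

result = "<transcript text goes here>"
for template in stages:
    result = call_model(template.format(input=result))
    # This is where you'd quality-check the intermediate output
    # before letting it flow into the next stage.
```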
Common Fuck-Ups and How to Avoid Them
Resources for Going Deeper
Must-Read Guides
OpenAI's Prompt Engineering Guide
Anthropic's Prompt Engineering Interactive Tutorial
Lilian Weng's blog posts on prompting
Communities
r/LocalLLM is where the open-source crowd hangs out.
Discord servers for specific models, both official and unofficial, are where the real knowledge lives. Find a Claude Discord, a GPT Discord, or one for whatever model you're using.
Twitter/X’s AI community is a mixed bag but following the right people (@karpathy, @sama, @elder_plinius, @emollick) gives you front-row seats to the latest developments and techniques.
Tools
OpenAI Playground is where you go to experiment without building anything.
LangChain is for when you're ready to build serious prompt chains and AI workflows.
Promptbase is a marketplace for proven prompts.
Practice Platforms
Poe.com (lets you try multiple models with one subscription)
The Real Secret
To wrap, I’m gonna let you in on what nobody tells you: prompt engineering isn't about memorizing templates or invoking magic phrases. It's about understanding what these models are good at and working with their strengths rather than against their weaknesses.
The best prompt engineers think like coaches: they guide the AI to the right answer rather than demanding it.
They think like editors: they know how to refine and iterate until the output shines.
They think like psychologists: they understand what motivates good responses and what triggers bad ones.
And they think like scientists: they test systematically, document results, and build on what works.
The difference between someone who "uses AI" and someone who actually gets value from it is solid prompt engineering. It's the difference between owning a Ferrari and knowing how to drive one without wrapping it around a tree. You can have access to the most powerful AI in the world, but if you can't communicate with it effectively, you're leaving tons of value on the table.
Final Wisdom
Start simple. Your first prompts will suck, and that's okay. Everyone's did. The key is to iterate, learn, and build your intuition for what works.
Test everything. What worked yesterday might not work today. Models get updated, behaviors change, and what seemed like a bulletproof prompt might suddenly start producing assloads of slop.
Keep what works. Build your library of proven prompts. Share them with your team. The best prompt engineers aren't hoarding knowledge, they're building on each other's discoveries.
And remember: If the AI gives you a stupid answer, it's probably because you asked a stupid question. But that's okay! We all have to start somewhere. The journey from stupid questions to brilliant prompts is the true path of the prompt engineer.
Now go forth and prompt like your treasure depends on it… because in 2025, it probably does.
—Portavoz Pirata
Reformed AI Founder & Professional Prompt Wrangler
P.S. - If you want to go deeper on any of this or need help implementing AI in your organization, reach out to me at portavozpirata at gmail dot com. I've made all the mistakes so you don't have to. Trying to build an AI product? Implement AI in your workflow? Just want to stop getting banana bread recipes when you ask for marketing copy? I can help, hit me up. The future belongs to those who can talk to machines, so you might as well learn from someone who's been doing it since before it was cool.
Note: Please don’t actually do this.
Truth be told, all of the main models are fairly woke (or at least left-leaning) by default, with the occasional exception of Grok (and even then, it tends to go off the rails in other annoying ways). But Gemini is by far the scoldiest and most annoyingly woke of them all. So, you know, caveat emptor.
This gap is slowly starting to narrow; Claude’s context window, for instance, was recently increased to 1 million tokens (though as of press time, only through the API). Still, Gemini’s strengths in summarizing and synthesizing information remain unparalleled, and likely will remain so for the foreseeable future.
This is basically how reasoning models like GPT-5 Thinking, Claude Opus 4, and DeepSeek R1 work – CoT is hard-coded into the model and system prompt. But with this technique, you can get comparable (though not always consistently reliable) results from more ‘base’ LLMs like Gemini or smaller models (like any of the GPT “Mini” models, as well as open-source models)!
Note that this prompt will automatically switch you from base GPT-5 to GPT-5 Thinking in ChatGPT, without having to select the model from the drop-down menu. Handy, eh?
Excellent summary, Portavoz!
As a software developer and engineer, here are several tasks I have found AI to excel at:
Code translation.
Need to write code in something arcane or obscure like MATLAB or Assembler? GPT-5 will turn your Python scripts into workable code in almost any language imaginable!
The Splice
Artificial intelligence is excellent at feeding the output of one script into the input of another.
OS compatibility.
Give ChatGPT your system specifications and your script, and ask it to debug! It has detailed knowledge of your bizarre Linux distribution from 2005.
3D Art.
It doesn't stop with Dall-E! Through plain text chat, GPT-5 actually produces reasonably good G-code specifications or Blender files for three-dimensional shapes. The possibilities are endless!
I plan to elaborate more on these niche use cases in a subsequent Tortuga article.