Note
Key shift: Moving from model training as the default to prompt engineering as the default.
Note
Mental model: LLMs = Transformer decoder stacks trained for auto‑regressive generation.
Encoders (BERT) are great for annotation as classification; LLMs add text generation & instruction following.
Note
Pipeline: Pretrained LM → SFT on instructions → better adherence to prompts & JSON schemas.
Prompt:
Write a five-sentence sci-fi plot hook in the style of Dr Seuss:

Completions from a base (pretrained-only) model drift off-task:

The Frog King!

He will not die, he will not disappear, he will never fall. He will be the hero that we all want him to be. The more you can think of this as a fictionalized version of the human condition, the more likely you are to think of it as the

A completion from an instruction-tuned model follows the brief:

In a land where the stars twinkle bright,
Lived a creature named Zibble, who danced with delight.
But one fateful night, a strange ship did land,
With robots and gadgets all made by hand.
Now Zibble must choose: stay safe in his nook,
Or venture with friends to explore the great book!
Note
Big picture: Transformer pretraining → Instruction tuning (SFT) → RLHF ⇒ the chat experience you expect.
Note
System (stable)
You are a careful annotator. Return ONLY valid JSON:
{"label":"...","rationale":"..."}.
Labels: protest, discrimination, solidarity, uncertain.
Follow definitions neutrally; if ambiguous, use "uncertain".
User (per item/batch)
Decide the label for the text and return ONLY the JSON object.
Text: "Thousands gathered in front of parliament."
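The system prompt pins down an output contract; it pays to enforce that contract before model replies enter your dataset. A minimal validator for the schema above (a hypothetical helper, not part of any API) might look like this:

```python
import json

# Labels promised by the system prompt above.
ALLOWED = {"protest", "discrimination", "solidarity", "uncertain"}

def validate(raw):
    """Parse a model reply and check it matches the promised JSON schema."""
    obj = json.loads(raw)  # raises ValueError on non-JSON replies
    if set(obj) != {"label", "rationale"}:
        raise ValueError(f"unexpected keys: {sorted(obj)}")
    if obj["label"] not in ALLOWED:
        raise ValueError(f"unknown label: {obj['label']!r}")
    return obj

ok = validate('{"label": "protest", "rationale": "Crowd at parliament."}')
print(ok["label"])  # → protest
```

Rejected replies can then be retried or routed to human review instead of silently polluting your labels.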
Tip
Prototype in a UI if convenient, then freeze the prompt and move to code for reproducibility.
import os, requests

API_BASE = os.getenv("LLM_API_BASE")  # e.g., https://api.openai.com/v1
API_KEY = os.getenv("LLM_API_KEY")
MODEL = os.getenv("LLM_MODEL", "gpt-4o-mini")

def chat(messages, temperature=0.2, top_p=1.0, max_tokens=256, response_format=None, seed=None):
    url = f"{API_BASE}/chat/completions"  # adjust if provider differs
    headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
    body = {
        "model": MODEL,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }
    if response_format:
        body["response_format"] = response_format
    if seed is not None:
        body["seed"] = seed
    r = requests.post(url, headers=headers, json=body, timeout=60)
    r.raise_for_status()
    return r.json()
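Provider APIs rate-limit and occasionally time out, so batch jobs should not call `chat()` bare. A generic retry wrapper with exponential backoff (a hypothetical helper sketched here, not part of any provider SDK) is enough for most annotation runs:

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(); on failure, back off exponentially and retry."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** i)  # 1s, 2s, 4s, ...

# Example with a stand-in function that fails twice before succeeding:
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # → ok
```

In practice you would wrap the real call, e.g. `with_retries(lambda: chat(messages))`.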
ollama run llama3:8b   # interactive session
ollama serve           # HTTP API on :11434

# Local generation with Transformers (causal LM)
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tok = AutoTokenizer.from_pretrained(model_id)
m = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
prompt = "Classify the following text as protest / discrimination / solidarity: ..."
inputs = tok(prompt, return_tensors="pt").to(m.device)
out = m.generate(**inputs, max_new_tokens=128, temperature=0.2, do_sample=True, top_p=0.9)
print(tok.decode(out[0], skip_special_tokens=True))
Remote providers authenticate requests with an "Authorization: Bearer <key>" HTTP header:
# Example: POST to OpenRouter (adjust model name as needed)
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3-70b-instruct",
    "messages": [{"role": "user", "content": "Classify: ..."}],
    "temperature": 0.2,
    "top_p": 0.9
  }'
You are a careful social science annotator.
Task: assign one of {protest, discrimination, solidarity}.
Return ONLY JSON: {"label": "...", "rationale": "..."}
Guidelines:
- protest: ...
- discrimination: ...
- solidarity: ...
Edge cases: ...
Tip
For annotation, start with temperature = 0–0.2, top_p = 0.9–1.0, seed fixed, max_tokens sized for your JSON.
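Why low temperature plus nucleus sampling yields near-deterministic labels can be seen in a few lines of arithmetic. This pure-Python sketch reshapes a next-token distribution the way samplers do; the toy logits are an assumption for illustration, not real model output:

```python
import math

def sample_filter(logits, temperature=0.2, top_p=0.9):
    """Apply temperature scaling then nucleus (top-p) filtering.
    Returns the surviving tokens with renormalized probabilities."""
    # Temperature scaling: dividing logits by a small T sharpens the softmax.
    scaled = {t: l / max(temperature, 1e-6) for t, l in logits.items()}
    z = sum(math.exp(v) for v in scaled.values())
    probs = {t: math.exp(v) / z for t, v in scaled.items()}
    # Nucleus filtering: keep the smallest top set whose mass reaches top_p.
    kept, mass = {}, 0.0
    for t, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[t] = p
        mass += p
        if mass >= top_p:
            break
    z = sum(kept.values())
    return {t: p / z for t, p in kept.items()}

# Toy logits: at T=0.2 the top label absorbs ~99% of the mass,
# so top_p=0.9 keeps only that one token.
print(sample_filter({"protest": 2.0, "solidarity": 1.0, "discrimination": 0.5}))
```

At higher temperatures more tokens survive the nucleus, which is why creative tasks use T near 1.0 while annotation stays near 0.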
import os, json, time
import pandas as pd
import requests

API_BASE = os.getenv("LLM_API_BASE")
API_KEY = os.getenv("LLM_API_KEY")
MODEL = os.getenv("LLM_MODEL", "gpt-4o-mini")

def classify(text):
    messages = [
        {"role": "system", "content": "You are a careful social science annotator. Return ONLY JSON."},
        {"role": "user", "content": f"Task: assign one of {{protest, discrimination, solidarity}}.\nText: {text}\nReturn JSON with fields: label, rationale."}
    ]
    r = requests.post(
        f"{API_BASE}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
        json={"model": MODEL, "messages": messages, "temperature": 0.1,
              "top_p": 0.95, "max_tokens": 200, "seed": 7},
        timeout=60,
    )
    r.raise_for_status()
    content = r.json()["choices"][0]["message"]["content"]
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        # simple repair pass: keep only the outermost {...} span,
        # which strips prose or code fences wrapped around the JSON
        start, end = content.find("{"), content.rfind("}")
        return json.loads(content[start:end + 1])

df = pd.DataFrame({"text": [
    "Thousands gathered in front of parliament.",
    "Volunteers cleaned the park and cooked for neighbors.",
    "He yelled slurs at a woman on the tram."
]})

out = []
for t in df["text"]:
    try:
        out.append(classify(t))
    except Exception as e:
        out.append({"label": None, "rationale": f"ERROR: {e}"})
    time.sleep(0.1)  # be polite / rate limits

df_out = pd.concat([df, pd.json_normalize(out)], axis=1)
print(df_out)