Temperature (LLM)

Temperature is a setting that controls how random a language model's output is during generation: low values produce consistent, predictable answers, higher values produce varied, creative ones. It is a key reason the same prompt about a product category can name different brands on different runs of the same AI.

How temperature controls generation

At each step of inference, the model scores every possible next token and turns those scores into probabilities. Temperature reshapes that distribution before sampling by dividing the raw scores (logits) by the temperature value: near zero, the model almost always picks the most probable token, yielding nearly deterministic output; higher values flatten the distribution, letting less likely tokens through and increasing variety.
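To make the reshaping concrete, here is a minimal sketch of temperature-scaled softmax in plain Python. The four logits are made-up numbers for illustration, and the function name is our own; real models apply the same idea across vocabularies of many thousands of tokens.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores (logits) into probabilities, scaled by temperature."""
    if temperature <= 0:
        # Temperature 0 is usually handled as "always pick the top token".
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for four candidate next tokens (illustrative only).
logits = [2.0, 1.5, 1.0, 0.2]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At the lowest temperature almost all of the probability lands on the first token; at the highest, the four candidates sit much closer together, so sampling picks the others far more often.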

Consumer chat products typically run at moderate temperatures for natural-sounding conversation, which means answers, including which brands get named, vary meaningfully between runs of the identical prompt.

Temperature and brand mention variability

For recommendation queries, several brands often sit close together in probability. At non-zero temperature, sampling decides which subset appears in any given answer: you might be named in six runs out of ten. Your true visibility is that frequency, not the binary outcome of one check. This is the statistical root of answer volatility.

A strong brand association in training data and retrieved sources raises your baseline probability, so you survive sampling more often. Weakly associated brands flicker in and out; strongly associated ones appear consistently at any reasonable temperature.
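The flicker effect can be illustrated with a toy simulation. This is a deliberate simplification, not how any real platform works: it assumes each answer names three brands, drawn one at a time according to temperature-scaled probabilities, whereas real models sample tokens rather than whole brands. The brand names and scores are hypothetical.

```python
import math
import random

def temperature_probs(scores, temperature):
    """Temperature-scaled softmax over a dict of brand -> score."""
    scaled = {b: s / temperature for b, s in scores.items()}
    top = max(scaled.values())
    exps = {b: math.exp(v - top) for b, v in scaled.items()}
    total = sum(exps.values())
    return {b: e / total for b, e in exps.items()}

def sample_answer(scores, temperature, k, rng):
    """Pick k distinct brands, one at a time, weighted by probability."""
    remaining = temperature_probs(scores, temperature)
    named = []
    for _ in range(k):
        brands = list(remaining)
        weights = [remaining[b] for b in brands]
        pick = rng.choices(brands, weights=weights, k=1)[0]
        named.append(pick)
        del remaining[pick]  # a brand is named at most once per answer
    return named

def mention_rates(scores, temperature, k=3, runs=1000, seed=0):
    """Share of simulated answers in which each brand is named."""
    rng = random.Random(seed)
    counts = {b: 0 for b in scores}
    for _ in range(runs):
        for brand in sample_answer(scores, temperature, k, rng):
            counts[brand] += 1
    return {b: c / runs for b, c in counts.items()}

# Hypothetical association scores: one strong brand, four close rivals.
scores = {"BrandA": 3.0, "BrandB": 1.2, "BrandC": 1.0, "BrandD": 0.9, "BrandE": 0.8}

rates = mention_rates(scores, temperature=0.7)
for brand, rate in rates.items():
    print(f"{brand}: named in {rate:.0%} of simulated answers")
```

Under these assumptions the strongly associated brand is named in nearly every simulated answer, while the four closely matched rivals each appear in only a fraction of them.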

What marketers should take from this

First, never judge AI visibility from a single query: sampling noise will mislead you in both directions. Second, aim to raise your underlying probability: dominant share of voice in source material makes you the high-probability token sequence that low and high temperatures alike select.

Measurement must match the mechanism: repeated daily sampling of the same prompts, aggregated into mention rates. Geonimo's tracking is built on exactly this principle, surfacing your stable visibility level beneath run-to-run randomness.
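As a rough sketch of what aggregating repeated checks can look like, the snippet below turns a list of per-run outcomes into a mention rate with an approximate 95% Wilson score interval. It illustrates the general statistical idea only and is not a description of Geonimo's implementation; the function names and sample data are assumptions.

```python
import math

def mention_rate(results):
    """results: list of booleans, True when the brand was named in that run."""
    if not results:
        raise ValueError("need at least one run")
    return sum(results) / len(results)

def wilson_interval(mentions, runs, z=1.96):
    """Approximate 95% Wilson score interval for a mention rate."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    p = mentions / runs
    denom = 1 + z ** 2 / runs
    centre = (p + z ** 2 / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z ** 2 / (4 * runs ** 2)) / denom
    return centre - half, centre + half

# Hypothetical outcomes for ten runs of the same prompt: named in six.
results = [True, False, True, True, False, True, False, True, True, False]

rate = mention_rate(results)
low, high = wilson_interval(sum(results), len(results))
print(f"mention rate: {rate:.0%} (95% interval {low:.0%} to {high:.0%})")
```

Six mentions in ten runs gives an interval of roughly 31% to 83%, which is why a handful of checks says little. The same 60% rate over a hundred runs narrows to roughly 50% to 69%.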

Frequently asked questions

What does temperature do in an LLM?

Temperature scales the randomness of token sampling during generation. Low temperature makes the model pick its most probable words, giving consistent answers; high temperature admits less likely words, giving varied, creative output. Chat products use moderate values, so identical prompts produce noticeably different answers across runs.

Is temperature why ChatGPT gives different brand recommendations each time?

It is a major reason. With several brands at similar probability, sampling at non-zero temperature selects different subsets per run. Model routing and retrieval changes add further variation. Brands with stronger underlying associations get sampled more often, which is why visibility should be measured as a frequency.

Can I control the temperature of public AI platforms?

Not in consumer products: platforms fix their own settings. API users can set temperature for their applications. For marketers, the lever is not the dial but the distribution: strengthening your brand's presence in training data and retrievable sources raises your probability of being named at any temperature.

Last updated: 2026-06-11
