Alan Turing Institute · Defence & Security Programme

Beyond the Point: Knowledge as Probability

PICASO replaces rigid knowledge graphs with probabilistic clouds — achieving state-of-the-art accuracy while telling you exactly how confident each answer is.

Intelligence analysts face a paradox: the data they need to make sense of is vast, ambiguous, and often contradictory — yet the computational tools they rely on demand certainty. PICASO resolves this by making uncertainty a first-class citizen in knowledge representation.

Real-world intelligence analysis contends with data of immense volume, variety, velocity, and variable veracity. Analysts routinely confront ambiguous, vague, and conflicting information from disparate sources. Unlike existing methods, which merely track factual changes, analysts build complex mental models that intrinsically weigh information by its uncertainty and dynamically refine beliefs as new evidence arrives.

Yet the computational systems we have built — traditional deterministic knowledge graphs — force this complex world into brittle, rigid points. A standard knowledge graph embedding represents every entity as a single coordinate in space: one fixed pin on a map. When one source reports an entity is "in a foreign country" and another says it is "in a city near the border", the system has no principled way to synthesise these vague, non-overlapping statements.

Traditional Approach: Fixed Points

Entities are single coordinates. No notion of confidence. Conflicting evidence is either ignored or averaged into a meaningless midpoint. The model cannot say "I don't know."

PICASO: Probability Clouds

Every entity is a Gaussian distribution with a location (what we think it is) and a spread (how sure we are). Conflicting evidence naturally produces wider, more honest clouds.

The PICASO Framework

PICASO (Probabilistic Conceptual Spaces Sensemaking under Uncertainty) operationalises Peter Gärdenfors's Conceptual Space theory as an end-to-end differentiable probabilistic embedding model. At its core, the PROCS engine replaces every point embedding with a diagonal multivariate Gaussian, in which the mean encodes the most likely concept position and the variance explicitly encodes uncertainty from vagueness, noise, and conflicting data.
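As a minimal illustration of the idea (not the PROCS code; the class and its fields are assumptions made for this sketch), an entity embedding of this form is just a mean vector plus per-dimension log-variances:

```python
import math

class GaussianEntity:
    """Toy diagonal-Gaussian embedding: mu is the best-guess position,
    sigma2 the per-dimension uncertainty. Illustrative only; the names
    and structure are assumptions, not the PROCS API."""

    def __init__(self, mu, log_var):
        self.mu = mu            # location: what we think it is
        self.log_var = log_var  # log-variance: how sure we are

    @property
    def sigma2(self):
        return [math.exp(lv) for lv in self.log_var]

    def uncertainty(self):
        # A simple scalar summary: mean variance across dimensions.
        return sum(self.sigma2) / len(self.sigma2)

# A precisely reported entity vs a vague report ("Vehicle"):
tank = GaussianEntity(mu=[1.0, 2.0], log_var=[-2.0, -2.0])   # tight cloud
vehicle = GaussianEntity(mu=[1.2, 1.8], log_var=[1.0, 1.0])  # wide cloud
assert tank.uncertainty() < vehicle.uncertainty()  # vagueness widens the cloud
```

Parameterising the spread as a log-variance keeps the variance positive under gradient training, which is why it is a common choice for Gaussian embeddings.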

[Figure: two panels over the same two-dimensional space. Left, "Traditional: Point Embedding": Tank, APC, Truck, and Jeep plotted as single points, with an ambiguous "Vehicle" report having no natural place. Right, "PICASO: Gaussian Distributions": the same entities as Gaussian clouds, with "Vehicle" drawn as a wide, uncertain distribution.]
Left: Traditional embeddings place every entity at a single fixed point — ambiguity has nowhere to go. Right: PICASO models entities as Gaussian clouds; vague reports like "Vehicle" naturally produce wider distributions.
1. Entities as Gaussians

Each entity is a distribution N(μ, σ²) where the mean is the best-guess position and the variance encodes uncertainty.

2. Probabilistic Relations

Relations transform distributions via rotation, scaling, and translation — propagating uncertainty through each inference step.

3. Geometric Scoring

Five complementary scoring functions (geometric, translational, KL, bilinear, complex) jointly constrain the space.

4. Calibrated Uncertainty

A dedicated calibration loss ensures that stated confidence aligns with actual accuracy — no post-hoc patches needed.
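The first three components above can be sketched end-to-end in a few lines. This is an illustrative toy, not the PROCS implementation: it treats a relation as a per-dimension scaling plus translation (if X ~ N(μ, σ²) then aX + b ~ N(aμ + b, a²σ²)) and scores the match with the closed-form KL divergence between diagonal Gaussians, one of the scoring families named above.

```python
import math

def apply_relation(mu, var, scale, shift):
    """Affine relation on a diagonal Gaussian: if X ~ N(mu, var) then
    scale*X + shift ~ N(scale*mu + shift, scale**2 * var), so the
    head entity's uncertainty is propagated, not discarded."""
    new_mu = [s * m + b for s, m, b in zip(scale, mu, shift)]
    new_var = [s * s * v for s, v in zip(scale, var)]
    return new_mu, new_var

def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(P || Q) between diagonal Gaussians; used here as
    one candidate scoring function (lower = better match)."""
    return 0.5 * sum(
        vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )

# Head entity -> relation -> compare against two candidate tails.
head_mu, head_var = [0.0, 0.0], [0.1, 0.1]
pred_mu, pred_var = apply_relation(head_mu, head_var,
                                   scale=[1.0, 1.0], shift=[1.0, 1.0])
good_tail = ([1.0, 1.0], [0.1, 0.1])
bad_tail = ([3.0, -2.0], [0.1, 0.1])
assert kl_diag_gaussians(pred_mu, pred_var, *good_tail) \
     < kl_diag_gaussians(pred_mu, pred_var, *bad_tail)
```

Because the relation output is itself a Gaussian, the same scoring step can be chained across multi-hop inferences, with the variance growing at each uncertain step.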

Speed Through Geometry

Bayesian methods are powerful but usually too slow because they rely on sampling — running millions of simulations. PICASO skips the simulations entirely. Because we shape our clouds as Gaussians, we use direct geometric formulas to calculate interactions instantly. The PROCS engine achieves O(1) time complexity with respect to graph size, delivering deep-reasoning accuracy with the speed of standard embeddings.
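To see why closed-form geometry beats sampling, the toy below computes the expected squared distance between two Gaussian-distributed entities both ways: a one-line analytic formula and a brute-force Monte Carlo estimate. The function names are illustrative, not PICASO's API; the point is that the analytic answer needs no simulation yet agrees with the simulated one.

```python
import random

def closed_form_expected_sqdist(mu_x, var_x, mu_y, var_y):
    """E||X - Y||^2 for independent diagonal Gaussians, in closed form:
    sum_i (mu_x_i - mu_y_i)^2 + var_x_i + var_y_i. No sampling needed."""
    return sum((mx - my) ** 2 + vx + vy
               for mx, vx, my, vy in zip(mu_x, var_x, mu_y, var_y))

def monte_carlo_expected_sqdist(mu_x, var_x, mu_y, var_y,
                                n=200_000, seed=0):
    """The same quantity by brute-force simulation -- the slow route
    that the closed-form geometry avoids."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += sum(
            (rng.gauss(mx, vx ** 0.5) - rng.gauss(my, vy ** 0.5)) ** 2
            for mx, vx, my, vy in zip(mu_x, var_x, mu_y, var_y)
        )
    return total / n

mu_x, var_x = [0.0, 1.0], [0.5, 0.2]
mu_y, var_y = [1.0, 1.0], [0.3, 0.4]
exact = closed_form_expected_sqdist(mu_x, var_x, mu_y, var_y)   # instant
approx = monte_carlo_expected_sqdist(mu_x, var_x, mu_y, var_y)  # 200k draws
assert abs(exact - approx) < 0.05  # same answer, wildly different cost
```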

Benchmark Results: Surpassing the State of the Art

We evaluated PROCS on the standard FB15k-237 and WN18RR benchmarks against established embedding techniques and heavy graph neural networks.

0.417
MRR on FB15k-237
+15.2% vs best baseline
0.339
Hits@1 on FB15k-237
+25.1% vs best baseline
89.0%
Triple Classification Accuracy
+12.6 pts vs R2D2+
0.949
ROC-AUC (FB15k-237)
+9.2 pts vs R2D2+
Model        | Methodology                   | MRR   | Hits@1 | Hits@3 | Hits@10
TransE       | Translational Point Embedding | 0.294 | –      | –      | 0.465
RotatE       | Rotation in Complex Space     | 0.338 | 0.241  | 0.375  | 0.533
TuckER       | Tensor Factorisation          | 0.358 | 0.266  | 0.394  | 0.544
GIE          | Geometric Interaction Embedding | 0.362 | 0.271 | 0.401  | 0.552
NBFNet       | Deep Graph Neural Network     | 0.415 | 0.321  | 0.454  | 0.599
PROCS (Ours) | Bayesian Conceptual Space     | 0.417 | 0.339  | 0.457  | 0.559

Link prediction results on FB15k-237. PROCS matches the heavy NBFNet while retaining O(1) inference.

PROCS surpasses the computationally heavy NBFNet on FB15k-237 in both Mean Reciprocal Rank (0.417 vs 0.415) and strict top-1 retrieval (0.339 vs 0.321), achieving this without the overhead of explicit path extraction. Unlike every baseline above, PROCS additionally provides calibrated uncertainty estimates, enabling selective prediction and risk-aware decision support.

Honest Confidence: Calibrated Uncertainty

The defining advantage of PICASO is that when it says "I'm confident", it is more likely to be correct — and when it says "I'm uncertain", you should seek human review. This is validated by uncertainty-stratified accuracy:

[Chart: Uncertainty-Stratified Accuracy on FB15k-237 (Hits@1), by uncertainty bin (1 = most confident, 10 = least confident): bin 1: 59.1%, bin 2: 52.9%, bin 3: 43.6%, bin 4: 46.3%, bin 5: 49.7%, bin 6: 51.4%, bin 7: 33.9%, bin 8: 30.4%, bin 9: 40.2%, bin 10: 29.2%; overall average: 33.9%.]
Lower-uncertainty bins consistently achieve higher accuracy. When PICASO says it is confident, it is correct far more often — enabling risk-aware filtering for high-stakes decisions.

Real-World Impact: Defence Knowledge Graph

Standard benchmarks reward memorisation. In real defence and intelligence applications, the cost of a false positive — hallucinating a link between a military unit and a location — is unacceptable. To demonstrate PICASO's practical value, we automatically constructed a novel Defence-Wikidata Knowledge Graph.

Defence-Wikidata Knowledge Graph

We scanned over 63 million Wikidata entities and extracted defence-relevant knowledge spanning military units, battles, weapons systems, treaties, operations, intelligence agencies, and personnel.

68,826
Defence Entities
143,532
Relations
355
Relation Types
63M+
Wikidata Entities Scanned

On this challenging, sparse domain (average 3.3 relations per entity), the uncertainty estimates enable selective prediction: by stratifying predictions into confidence tiers, decision-makers can act on confident predictions and defer uncertain ones for human review.

High Confidence
45.1% Hits@1
Medium Confidence
52.4% Hits@1
Low Confidence
26.3% Hits@1
Why This Matters for Decision-Makers

High-confidence predictions achieve 45.1% Hits@1 while low-confidence predictions drop to 26.3%. This 18.8 percentage-point gap means a decision-maker can selectively trust the model's confident outputs and route uncertain predictions to human analysts — a capability no deterministic baseline can offer natively. In high-stakes environments, knowing when not to trust the model is as valuable as the predictions themselves.
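A confidence-tier routing policy of this kind takes only a few lines. The thresholds, field names, and example triples below are hypothetical, purely to illustrate the triage pattern:

```python
def route_predictions(predictions, high_cut, low_cut):
    """Triage model outputs by their stated uncertainty. The thresholds
    and dict fields are illustrative, not part of PICASO's API."""
    auto, review, defer = [], [], []
    for pred in predictions:
        if pred["uncertainty"] <= high_cut:
            auto.append(pred)      # confident: act on it
        elif pred["uncertainty"] <= low_cut:
            review.append(pred)    # medium: flag for analyst review
        else:
            defer.append(pred)     # uncertain: defer entirely
    return auto, review, defer

preds = [
    {"triple": ("UnitA", "locatedIn", "CityX"),  "uncertainty": 0.12},
    {"triple": ("UnitB", "operates", "SystemY"), "uncertainty": 0.47},
    {"triple": ("UnitC", "alliedWith", "OrgZ"),  "uncertainty": 0.91},
]
auto, review, defer = route_predictions(preds, high_cut=0.2, low_cut=0.6)
assert [len(auto), len(review), len(defer)] == [1, 1, 1]
```

The calibration property is what makes this triage meaningful: because stated confidence tracks actual accuracy, the cut-offs translate directly into expected error rates per tier.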

Federated Security with FLiP

Defence and intelligence data is inherently sensitive and decentralised — different agencies hold different pieces of the puzzle, and sharing raw data is often prohibited. To address this, we developed FLiP (Federated Lightweight Prompt-tuning), presented at LREC-COLING 2025. FLiP enables collaborative model training across organisational boundaries without ever sharing raw data.

16%

Trainable Parameters

By sharing lightweight prompts instead of full model weights, FLiP reduces the parameter footprint to just 16% of traditional approaches.

90%

GPU Memory Reduction

Cuts GPU memory consumption by 90%, enabling deployment on resource-constrained edge devices common in defence settings.

Zero

Raw Data Shared

Only compact prompt vectors cross organisational boundaries — the underlying sensitive data never leaves its source.

Full

Privacy Compliance

Compatible with differential privacy guarantees and existing information governance frameworks.
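A minimal sketch of the federated pattern, under the assumption that each client's contribution is a small prompt vector updated by local gradients (the function names and update rule are illustrative, not the FLiP implementation):

```python
def local_prompt_update(prompt, gradient, lr=0.1):
    """One local training step on private data. Only `prompt` (a small
    vector) ever leaves the organisation; the data that produced
    `gradient` stays local."""
    return [p - lr * g for p, g in zip(prompt, gradient)]

def federated_average(client_prompts):
    """Server-side aggregation: average the compact prompt vectors.
    No raw data and no full model weights cross the boundary."""
    n = len(client_prompts)
    return [sum(vals) / n for vals in zip(*client_prompts)]

# Three agencies tune the shared prompt locally, then share only prompts.
shared = [0.0, 0.0, 0.0]
agency_grads = [[0.2, -0.1, 0.0], [0.4, 0.1, -0.2], [0.0, 0.3, 0.1]]
local = [local_prompt_update(shared, g) for g in agency_grads]
new_shared = federated_average(local)
assert len(new_shared) == len(shared)  # aggregate stays prompt-sized
```

Because only the prompt vector is exchanged, standard hardening such as differential-privacy noise can be applied to that single small vector before it leaves each site.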

Stop Working with Static Points.
Start Working with Intelligent Probabilities.

PICASO is open-source and pip-installable. Explore the code, run the examples, and see probabilistic knowledge representation in action.