| id (string, 3-18 chars) | game (string, 1 class) | name (string, 3-18 chars) | card_text (string, 102-224 chars) | has_image (bool, 2 classes) | multimodal_embedding (list, length 1.02k) |
|---|---|---|---|---|---|
ACCURACY | sts1 | Accuracy | {
"name": "Accuracy",
"type": "Power",
"rarity": "Uncommon",
"color": "silent",
"cost": "1",
"description": "Shivs deal 4 additional damage."
} | true | [
0.014700688421726227,
-0.08631281554698944,
0.02836974896490574,
-0.0491742305457592,
0.016935881227254868,
0.06533639132976532,
-0.09284645318984985,
-0.004728291649371386,
0.013840998522937298,
0.029229437932372093,
0.03077687881886959,
-0.0033098040148615837,
-0.04264059290289879,
-0.00... |
ACROBATICS | sts1 | Acrobatics | {
"name": "Acrobatics",
"type": "Skill",
"rarity": "Common",
"color": "silent",
"cost": "1",
"description": "Draw 3 cards.\nDiscard 1 card."
} | true | [
0.024564918130636215,
-0.09895654767751694,
0.040767308324575424,
-0.06550644338130951,
-0.016202392056584358,
0.037282925099134445,
-0.07979242503643036,
-0.010148272849619389,
-0.020383654162287712,
-0.01045315619558096,
0.0149828577414155,
-0.003070614766329527,
-0.05679548159241676,
-0... |
ADAPTATION | sts1 | Rushdown | {
"name": "Rushdown",
"type": "Power",
"rarity": "Uncommon",
"color": "watcher",
"cost": "1",
"description": "Whenever you enter Wrath, draw 2 cards."
} | true | [
0.030871998518705368,
-0.08515243232250214,
0.053262677043676376,
-0.06344025582075119,
0.014248614199459553,
0.029684612527489662,
-0.01789558120071888,
-0.007548373192548752,
0.005343230441212654,
0.037317801266908646,
-0.003943813033401966,
-0.016793010756373405,
-0.02408694289624691,
-... |
ADRENALINE | sts1 | Adrenaline | {
"name": "Adrenaline",
"type": "Skill",
"rarity": "Rare",
"color": "silent",
"cost": "0",
"description": "Gain [G].\nDraw 2 cards.\nExhaust."
} | true | [
0.0026408361736685038,
-0.05402890965342522,
-0.00034092762507498264,
-0.03255588188767433,
0.0263217780739069,
0.04814114421606064,
-0.04952650144696236,
0.00982737448066473,
-0.019308408722281456,
0.043465565890073776,
-0.0053899032063782215,
-0.041214361786842346,
-0.05368257313966751,
... |
AFTER_IMAGE | sts1 | After Image | {
"name": "After Image",
"type": "Power",
"rarity": "Rare",
"color": "silent",
"cost": "1",
"description": "Whenever you play a card, gain 1 Block."
} | true | [
0.003985612653195858,
-0.09480216354131699,
0.04944717139005661,
-0.02642866037786007,
0.009804179891943932,
0.017306510359048843,
-0.0716131404042244,
0.019778868183493614,
-0.008823762647807598,
0.0014280001632869244,
0.029156779870390892,
-0.03887570649385452,
-0.05319833382964134,
-0.0... |
AGGREGATE | sts1 | Aggregate | {
"name": "Aggregate",
"type": "Skill",
"rarity": "Uncommon",
"color": "defect",
"cost": "1",
"description": "Gain [B] for every 4 cards in your draw pile."
} | true | [
0.000724916229955852,
-0.07160911709070206,
0.016389410942792892,
-0.07329007983207703,
0.022861124947667122,
0.02403780072927475,
-0.053118497133255005,
0.01361581776291132,
0.02353351190686226,
-0.002248290926218033,
0.007102077826857567,
-0.04471367225050926,
-0.07766059041023254,
-0.02... |
ALL_FOR_ONE | sts1 | All for One | {
"name": "All for One",
"type": "Attack",
"rarity": "Rare",
"color": "defect",
"cost": "2",
"description": "Deal 10 damage.\nPut all cost 0 cards from your discard pile into your hand."
} | true | [
0.055822454392910004,
-0.0835624486207962,
0.03441813960671425,
-0.050685420632362366,
-0.01678098365664482,
0.024486536160111427,
-0.03698665648698807,
0.00410962849855423,
0.03732912614941597,
0.0026541349943727255,
0.013013823889195919,
-0.018664563074707985,
-0.04931554198265076,
-0.01... |
ALL_OUT_ATTACK | sts1 | All-Out Attack | {
"name": "All-Out Attack",
"type": "Attack",
"rarity": "Uncommon",
"color": "silent",
"cost": "1",
"description": "Deal 10 damage to ALL enemies.\nDiscard 1 card at random."
} | true | [
0.04346000775694847,
-0.061596862971782684,
0.04517103359103203,
-0.04996189847588539,
-0.027376383543014526,
0.014971459284424782,
-0.0680987536907196,
0.017623545601963997,
0.03165394440293312,
0.015399215742945671,
0.015655869618058205,
-0.014201498590409756,
0.0018607384990900755,
-0.0... |
ALPHA | sts1 | Alpha | {
"name": "Alpha",
"type": "Skill",
"rarity": "Rare",
"color": "watcher",
"cost": "1",
"description": "Shuffle a Beta into your draw pile.\nExhaust."
} | true | [
-0.006376778241246939,
-0.10896916687488556,
-0.007591403089463711,
-0.027415810152888298,
-0.014922529458999634,
0.006506916601210833,
-0.09508774429559708,
0.025160077959299088,
-0.004858497995883226,
0.00997727271169424,
0.03192727267742157,
-0.01752529665827751,
-0.03522410988807678,
-... |
AMPLIFY | sts1 | Amplify | "{\n \"name\": \"Amplify\",\n \"type\": \"Skill\",\n \"rarity\": \"Rare\",\n \"color\": \"defect(...TRUNCATED) | true | [0.06471578776836395,-0.08673948794603348,0.04506387561559677,-0.044725049287080765,0.02439548075199(...TRUNCATED) |
Slay the Spire 1: Multimodal Card Embeddings
Joint text+image embeddings for every card in Slay the Spire (1.0 release), produced by Qwen/Qwen3-VL-Embedding-2B. One unit-normalized 1024-D vector per card. Mechanically AND visually similar cards land near each other; cards across STS1 and STS2 share the coordinate system.
This is the multimodal-embeddings dataset. For text-only embeddings or the underlying card metadata + portraits, see:
- t22000t/slay-the-spire-1-cards - metadata + features + inline portrait art
- t22000t/slay-the-spire-1-card-embeddings - text-only embeddings via Qwen3-Embedding-0.6B
All three are joinable on id. For the STS2 multimodal counterpart, see t22000t/slay-the-spire-2-card-multimodal-embeddings.
The full bundle (6 datasets across both games + 3 Gradio demos) is in the slaythespire-codex collection.
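As a rough illustration of joining on `id`, the sketch below merges rows from two of the repos once they have been loaded as lists of dicts. The toy rows, the shortened vectors, and the `embedding` column name for the text-only repo are assumptions made for the example; check the actual column names in each repo before relying on them.

```python
def join_on_id(left_rows, right_rows):
    """Inner-join two lists of row dicts on their shared "id" key."""
    right_by_id = {row["id"]: row for row in right_rows}
    joined = []
    for row in left_rows:
        match = right_by_id.get(row["id"])
        if match is not None:
            # Left-hand columns win on name clashes such as "name" or "game".
            joined.append({**match, **row})
    return joined


# Toy rows standing in for two repos; vectors are shortened placeholders.
multimodal = [
    {"id": "ACCURACY", "name": "Accuracy", "multimodal_embedding": [0.6, 0.8]},
    {"id": "ALPHA", "name": "Alpha", "multimodal_embedding": [1.0, 0.0]},
]
text_only = [
    {"id": "ACCURACY", "embedding": [0.0, 1.0]},  # hypothetical column name
]

rows = join_on_id(multimodal, text_only)
print([r["id"] for r in rows])
```

An inner join is used here so that cards missing from either side are dropped rather than padded with empty values.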
Data Fields
| Field | Type | Description |
|---|---|---|
| `id` | string | Stable card identifier - the join key |
| `game` | string | Always "sts1" |
| `name` | string | Display name |
| `card_text` | string | Prettified-JSON document fed to the encoder |
| `has_image` | bool | True when the card had a portrait at embed time |
| `multimodal_embedding` | list[float32] (1024) | Unit-normalized joint text+image vector |
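Because `card_text` is a JSON document stored as a string, it can be parsed back into a dict when you want to filter on individual attributes. A minimal sketch, using the `ACCURACY` row from the dataset preview; the `int()` conversion assumes a numeric cost, which may not hold for every card.

```python
import json

# card_text for ACCURACY, as shown in the dataset preview.
card_text = """{
  "name": "Accuracy",
  "type": "Power",
  "rarity": "Uncommon",
  "color": "silent",
  "cost": "1",
  "description": "Shivs deal 4 additional damage."
}"""

card = json.loads(card_text)
# Cost is stored as a string in the preview rows; guard this conversion
# if other rows turn out to carry non-numeric costs.
cost = int(card["cost"])
print(card["color"], card["type"], cost)
```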
Embedding recipe
- Model: `Qwen/Qwen3-VL-Embedding-2B`, frozen, no fine-tuning. The 0.6B text-only encoder used for `card-embeddings` is a different family member; the two repos are re-versioned independently.
- Image preprocessing: decode PNG → RGB (alpha dropped) → resize-with-pad to 512×512 on neutral grey. Padding rather than center-cropping preserves character iconography at the edges of STS portraits.
- Cards without art: fed text-only through the same model. This preserves the joint coordinate system; provenance records `n_with_image` and `n_without_image`.
- Truncation: Matryoshka to 1024-D, then re-normalized.
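The truncation step can be sketched in NumPy as follows. This is an illustration of Matryoshka-style truncation in general, not the pipeline's actual code; the 2048-wide input and the function name `truncate_and_renormalize` are assumptions for the example.

```python
import numpy as np


def truncate_and_renormalize(vectors, dim=1024):
    """Keep the first `dim` components of each row, then rescale rows to unit length."""
    truncated = np.asarray(vectors, dtype=np.float32)[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    # Clip to avoid dividing by zero on a degenerate all-zero row.
    return truncated / np.clip(norms, 1e-12, None)


rng = np.random.default_rng(0)
full = rng.normal(size=(4, 2048)).astype(np.float32)  # 2048 is an assumed full width
full /= np.linalg.norm(full, axis=1, keepdims=True)

emb = truncate_and_renormalize(full, dim=1024)
print(emb.shape)
print(np.linalg.norm(emb, axis=1))
```

Re-normalizing after the cut matters because a prefix of a unit vector is shorter than unit length, and dot products would otherwise no longer equal cosine similarity.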
Task instruction prepended at encode time:
Represent this Slay the Spire card so that mechanically similar cards (same archetype, comparable damage/block patterns, related keywords) are close in embedding space, using the card's text mechanics as the primary signal and the portrait art as a secondary cue for character/class and visual archetype.
The "primary/secondary" framing explicitly demotes art so visually similar but mechanically distant cards don't dominate clusters.
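For readers who want to reproduce the image preprocessing step, here is a minimal Pillow sketch of resize-with-pad. The grey value `(128, 128, 128)`, the default resampling filter, and the helper name `resize_with_pad` are assumptions; the recipe above only specifies "neutral grey" and a 512×512 target.

```python
from PIL import Image

GREY = (128, 128, 128)  # assumed value for "neutral grey"


def resize_with_pad(img, size=512, fill=GREY):
    """Drop alpha, scale the longer side to `size`, and centre on a grey canvas."""
    rgb = img.convert("RGB")  # discards the alpha channel
    scale = size / max(rgb.size)
    new_w = max(1, round(rgb.width * scale))
    new_h = max(1, round(rgb.height * scale))
    resized = rgb.resize((new_w, new_h))  # Pillow's default resampling filter
    canvas = Image.new("RGB", (size, size), fill)
    canvas.paste(resized, ((size - new_w) // 2, (size - new_h) // 2))
    return canvas


portrait = Image.new("RGBA", (300, 200), (255, 0, 0, 255))  # stand-in for a decoded PNG
out = resize_with_pad(portrait)
print(out.size, out.mode)
```

Scaling by the longer side keeps the whole portrait inside the canvas, so nothing at the edges is cropped away.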
Loading
from datasets import load_dataset
import numpy as np
ds = load_dataset("t22000t/slay-the-spire-1-card-multimodal-embeddings", split="train")
emb = np.array(ds["multimodal_embedding"], dtype=np.float32)
print(emb.shape) # (360, 1024)
print(emb @ emb[0]) # cosine similarity to card 0
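
Building on the dot product above, a small helper can return the nearest neighbours of a card while excluding the card itself. This is a sketch with toy 2-D unit vectors standing in for the real 1024-D matrix; `top_k_neighbors` is a hypothetical name, not part of the dataset or any library.

```python
import numpy as np


def top_k_neighbors(emb, query_index, k=5):
    """Return (index, similarity) pairs for the k rows closest to emb[query_index]."""
    sims = emb @ emb[query_index]   # dot product equals cosine for unit vectors
    sims[query_index] = -np.inf     # exclude the query card itself
    order = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in order]


# Toy unit vectors in place of the real embedding matrix.
toy = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]], dtype=np.float32)
neighbors = top_k_neighbors(toy, query_index=0, k=2)
print(neighbors)
```

With the real data, the returned indices can be mapped back to card names through the dataset's `name` column.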
Cross-game similarity
Same model, same instruction, same dim → STS1×STS2 similarity is a single dot product:
from datasets import load_dataset
import numpy as np
sts1 = load_dataset("t22000t/slay-the-spire-1-card-multimodal-embeddings", split="train")
sts2 = load_dataset("t22000t/slay-the-spire-2-card-multimodal-embeddings", split="train")
emb1 = np.array(sts1["multimodal_embedding"], dtype=np.float32)
emb2 = np.array(sts2["multimodal_embedding"], dtype=np.float32)
i = sts1["name"].index("Bash")
sims = emb2 @ emb1[i]
top = np.argsort(-sims)[:10]
for j in top:
print(f"{sims[j]:.3f} {sts2[j]['name']}")
Considerations for Using the Data
- Lookalike-bias risk. Multimodal embeddings can over-index on visual similarity. The instruction subordinates art to mechanics; if you observe undesirable lookalike clustering, the follow-up is a weighted concat of the separate text and image embeddings rather than a joint encode.
- Cards without art. 1 card (`IMPULSE`) has no portrait in the source JAR - its vector is text-only. `has_image=False` lets you filter or weight differently.
- English only in this snapshot.
- Game IP. Slay the Spire is © Mega Crit. The dataset ships factual reference data + numerical embedding vectors only - no card art bytes are redistributed in this repo (the upstream `cards` repo is where art lives).
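
The weighted-concat follow-up mentioned under lookalike bias could look roughly like this. It assumes you have separate unit-normalized text and image embeddings per card (this repo ships only the joint vector), and the 0.8 text weight is an arbitrary illustrative choice, not a recommended setting.

```python
import numpy as np


def weighted_concat(text_emb, image_emb, text_weight=0.8):
    """Fuse unit-norm text and image vectors so that, for two fused rows a and b,
    dot(a, b) = w * cos_text + (1 - w) * cos_image."""
    w = float(text_weight)
    return np.concatenate(
        [np.sqrt(w) * text_emb, np.sqrt(1.0 - w) * image_emb], axis=1
    )


# Toy data: two cards with unrelated text but identical art.
text = np.array([[1.0, 0.0], [0.0, 1.0]])
image = np.array([[1.0, 0.0], [1.0, 0.0]])

fused = weighted_concat(text, image, text_weight=0.8)
sim = float(fused[0] @ fused[1])
print(sim)
```

Scaling each half by the square root of its weight keeps the fused vectors unit-length, so the fused dot product is still a cosine similarity and the weight directly controls how much identical art can contribute.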
Provenance
A `provenance.json` file ships alongside the parquet. Its `multimodal_embed` block records the model id, embedding dim, task instruction, image preprocessing recipe, `n_with_image`, `n_without_image`, and timestamp.
Citation
@dataset{sts1_multimodal_card_embeddings,
title = {Slay the Spire 1: Multimodal Card Embeddings},
author = {timothy22000},
year = {2026},
url = {https://huggingface.co/datasets/t22000t/slay-the-spire-1-card-multimodal-embeddings},
note = {Embedded with Qwen3-VL-Embedding-2B; card data via nkhoit/spire-archive; game IP (c) Mega Crit}
}
Licensing
- Dataset: CC BY 4.0
- Pipeline code: MIT, see github.com/timothy22000/slaythespire-codex
- Embedding model: Apache 2.0 (Qwen/Qwen3-VL-Embedding-2B)
- Game IP: Slay the Spire is © Mega Crit