Why Do We Need to Categorize or Label? Why Do Computers / AI Need a System to Do This Step?
- Renee Li
- Nov 19
- 4 min read
Let's go one step deeper from "why do we need to categorize or label?"!
It is preparation for cross-domain application! That's how we can build Transformer AI models!
Without getting too technical or going into too much detail, I will again use Copilot, as it brilliantly explains this in a short summary!
🔹 Why We Need to Categorize or Label
Categorization is how we turn raw existence into meaning.
It's the act of saying: this person is a reviewer, this manuscript is published, this feedback is positive.
Without labels, everything is just undifferentiated data: no roles, no states, no history.
🔹 Why Computers Need a System for This
Computers don't "see" meaning; they only see rows, tokens, or vectors.
A system of categorization tells them how objects relate, what rules apply, and what new attributes emerge.
This makes queries possible, rules enforceable, and states trackable.
In short: labeling is the bridge between raw data and structured logic.
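To make that bridge concrete, here is a minimal sketch (the names and records are invented for illustration) of how a label turns raw rows into facts a program can reason about:

```python
# A minimal sketch with made-up names: the same raw records before
# and after labeling.
raw = [
    ("Ada", "manuscript_17"),
    ("Ben", "manuscript_17"),
]

# Without labels these are just undifferentiated pairs; adding a role
# and a state turns them into queryable, rule-checkable facts.
labeled = [
    {"person": person, "role": "reviewer",
     "manuscript": manuscript, "state": "under_review"}
    for person, manuscript in raw
]

reviewers_of_17 = [row["person"] for row in labeled
                   if row["manuscript"] == "manuscript_17"]
```

The raw tuples carry no meaning on their own; only the labeled records support questions like "who reviews manuscript 17?".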
🔹 Going One Step Deeper
Categorization isn't just about order; it's about cross-domain generalization.
Once you model "Assignment" or "Feedback" as entities, you can reuse them in education, healthcare, finance, and publishing.
This universality is exactly how Transformer AI models work: they don't just learn one dataset, they learn the pattern of relationships across domains.
Labeling prepares data to be portable, reusable, and generalizable: the foundation for cross-domain intelligence.
If you have more time to read, here is a more detailed and expanded explanation.
Let's go one step deeper into why categorization (labeling) is necessary, and why computers need systems to do it.
🔹 Why We Need to Categorize or Label
Human Cognition Mirror
Humans understand the world by grouping things: "This person is a reviewer," "This manuscript is published."
Categories reduce complexity; without them, every object would be an undifferentiated blob of attributes.
Labeling is how we impose meaning on raw data.
Identity Transformation
A manuscript isn't just "a document."
Once assigned, it becomes "a document under review."
Once published, it becomes "a document in issue #45."
Labels mark these transformations, so we can track states and histories.
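One way to sketch this state tracking (the class and state names here are hypothetical) is to record every label change, so both the current state and the full history stay queryable:

```python
from enum import Enum

class ManuscriptState(Enum):
    SUBMITTED = "submitted"
    UNDER_REVIEW = "under_review"
    PUBLISHED = "published"

class Manuscript:
    """Each label change is appended, so the history is never lost."""
    def __init__(self, title):
        self.title = title
        self.history = [ManuscriptState.SUBMITTED]

    def transition(self, new_state):
        self.history.append(new_state)

    @property
    def state(self):
        return self.history[-1]  # the current label

m = Manuscript("On Labeling")
m.transition(ManuscriptState.UNDER_REVIEW)
m.transition(ManuscriptState.PUBLISHED)
```

The same document object passes through three labels, and the history list is the record of its transformations.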
New Attributes Emerge
Categorization isn't cosmetic; it generates new data.
Example: the "Reviewer" label creates attributes like DateAssigned and Recommendation.
Without the label, those attributes wouldn't exist.
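A sketch of that idea (the attribute names follow DateAssigned and Recommendation from the text; everything else is invented):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Person:
    name: str

# The "Reviewer" label is what brings these attributes into existence;
# a plain Person has no DateAssigned or Recommendation.
@dataclass
class Reviewer:
    person: Person
    manuscript_id: int
    date_assigned: date
    recommendation: Optional[str] = None  # filled in once the review is done

r = Reviewer(Person("Ada"), manuscript_id=17, date_assigned=date(2025, 3, 1))
r.recommendation = "accept"
```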
🔹 Why Computers Need Systems to Do This
Relational Integrity
Computers don't "see" meaning; they only see rows and columns.
Categorization (via entities) tells the system how objects relate and what rules apply.
Example: A Reviewer role links Person to Manuscript with constraints (cannot review own manuscript).
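That constraint can be sketched as a hypothetical `assign_reviewer` helper (the function and data shapes are invented for illustration):

```python
# A hypothetical helper enforcing one relational rule:
# a person may not review their own manuscript.
def assign_reviewer(person, manuscript):
    if person == manuscript["author"]:
        raise ValueError(f"{person} cannot review their own manuscript")
    return {"reviewer": person, "manuscript": manuscript["id"]}

paper = {"id": 17, "author": "Ada"}
ok = assign_reviewer("Ben", paper)   # allowed
# assign_reviewer("Ada", paper)      # would raise ValueError
```

In a real database this rule would live in a constraint or trigger, but the logic is the same: the Reviewer category is what makes the rule expressible.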
Query Power
Without categories, queries become impossible.
You can't ask "Show me all manuscripts reviewed in 2025" unless the system has labeled those interactions.
Categories give the computer handles to filter, join, and aggregate.
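With labeled interactions in hand, that exact query becomes a one-liner (the review records below are toy data):

```python
from collections import Counter

# Toy review interactions, already labeled with reviewer and year.
reviews = [
    {"manuscript": "M1", "reviewer": "Ada", "year": 2025},
    {"manuscript": "M2", "reviewer": "Ben", "year": 2024},
    {"manuscript": "M3", "reviewer": "Ada", "year": 2025},
]

# Filter: "show me all manuscripts reviewed in 2025"
reviewed_2025 = sorted({r["manuscript"] for r in reviews if r["year"] == 2025})

# Aggregate: how many reviews each reviewer has done
per_reviewer = Counter(r["reviewer"] for r in reviews)
```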
Business Logic Enforcement
Rules live in categories.
Example: "At least 3 reviewers per manuscript" is enforced because the system knows who is labeled as Reviewer.
Without categorization, the computer can't enforce reality's demands.
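A minimal sketch of that rule check (the helper name and assignment records are invented):

```python
MIN_REVIEWERS = 3  # the rule: at least 3 reviewers per manuscript

def can_start_review(assignments, manuscript_id):
    # Checkable only because each assignment carries the Reviewer label
    # linking a person to a manuscript.
    count = sum(1 for a in assignments if a["manuscript"] == manuscript_id)
    return count >= MIN_REVIEWERS

assignments = [
    {"person": "Ada", "manuscript": "M1"},
    {"person": "Ben", "manuscript": "M1"},
    {"person": "Cy",  "manuscript": "M1"},
    {"person": "Dee", "manuscript": "M2"},
]
```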
Scalability & Evolution
As systems grow, new roles and states appear.
Categorization lets computers adapt without collapsing into chaos.
Metadata + entities = a framework for evolution.
✅ Conclusion
We categorize because reality demands order, and computers need explicit systems to mirror that order.
Humans: categories = understanding.
Computers: categories = enforceable meaning.
Entities born from interactions are the bridge between raw data and structured reality.
Expansion on the hidden purpose of categorization and interaction entities:
🔹 Cross-Domain Application
Foundational objects (Person, Manuscript, Interest) are domain-specific.
Interactions (Assignment, Feedback, Publication) are domain-agnostic patterns: they can be reused across contexts.
By modeling interactions as distinct entities, you're not just solving this manuscript system; you're preparing the schema to scale into other domains (education, finance, healthcare, publishing, etc.).
✅ Why This Matters
Reusability
The same "Assignment" pattern applies to:
Students assigned to courses
Doctors assigned to patients
Employees assigned to projects
Once you abstract it, you can port it anywhere.
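The abstraction can be sketched as one generic Assignment entity; the three instances below (reviewer, student, attending) are invented examples, not from any particular system:

```python
from dataclasses import dataclass

# One abstract Assignment pattern, reusable across domains.
@dataclass
class Assignment:
    assignee: str   # reviewer, student, doctor, employee...
    target: str     # manuscript, course, patient, project...
    role: str

peer_review = Assignment("Ada", "manuscript_17", role="reviewer")
education = Assignment("Sam", "CS101", role="student")
healthcare = Assignment("Dr. Lee", "patient_42", role="attending")
```

The shape never changes; only the domain vocabulary plugged into it does.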
Interoperability
Cross-domain systems (like ERP, CRM, HR, and publishing platforms) rely on shared interaction entities.
A "Feedback" entity works in peer review, performance reviews, and product reviews.
Scalability
Instead of hardcoding roles and states, you build interaction templates.
That makes your system future-proof: new domains just plug into the same interaction framework.
Business Logic Bridge
Cross-domain applications need consistent rules:
"At least 3 reviewers per manuscript"
"At least 2 doctors per surgery"
Categorization entities let you enforce these rules across domains.
🔹 Metaphor
Think of it like musical notation:
Notes (foundational objects) are specific to a song.
But the notation system (interactions) is universal; you can apply it to jazz, classical, or rock.
Once you have the notation, you can play across domains.
✅ Conclusion
The reason we categorize and create new entities for interactions is to prepare for cross-domain application. It's not just about manuscripts; it's about building a generalizable architecture of meaning.
🔹 Cross-Domain Categorization → Transformer Foundations
In databases, interactions label foundational objects, creating new entities with attributes.
In Transformers, attention mechanisms label tokens (words, pixels, events) by categorizing them in relation to others.
Both are about contextual meaning:
ERD: "This Person is a Reviewer of this Manuscript."
Transformer: "This word is important in relation to that word."
🔹 Why Categorization Prepares for Cross-Domain AI
Generalization
By abstracting interactions into entities, you create reusable patterns.
Transformers do the same: they don't just learn one language; they learn the pattern of relationships across languages, domains, and modalities.
Contextual Labeling
Database: Assignment labels a Person as Reviewer in context of Manuscript.
Transformer: Attention labels a token as relevant in context of a sentence.
Both are dynamic categorization engines.
New Attributes = Embeddings
In ERD, interactions generate new attributes (DateAssigned, Recommendation).
In Transformers, attention generates new embeddings (weighted vectors that encode meaning).
Both are derived properties born from relationships.
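The parallel can be sketched with a toy, single-query attention step in plain Python. The vectors below are invented for illustration; real Transformers use learned projection matrices, many heads, and high-dimensional embeddings:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Invented 2-d vectors for three tokens: one query, per-token keys and values.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Score each token against the query, then normalize: the weights are
# the "labels" saying how relevant each token is in this context.
scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
weights = softmax(scores)

# The derived embedding: a weighted mix of the value vectors,
# a new attribute born from the relationships between tokens.
context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(2)]
```

Just as DateAssigned only exists once the Reviewer label is applied, the context vector only exists once the attention weights relate the tokens to each other.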
đš Metaphor
Think of it like alchemy:
ERD interactions stamp objects with new categories.
Transformers stamp tokens with new weights.
Both are rituals of transformation, turning raw material into structured gold.
✅ Conclusion
Modeling interactions as entities isn't just good database practice; it's the conceptual scaffolding for cross-domain AI.
Transformers thrive because they treat every input as a potential entity, and every relationship as a label that generates new meaning.
That's why your realization, "It is preparing for cross-domain application!", is the same principle that underpins modern AI.