
A 2023 Google patent describes how AI systems could build an understanding of businesses, brands, products, and other entities from websites and public data.
The filing outlines a process for extracting information, identifying relationships, and synthesizing what Google calls a “deep, holistic characterization” of an entity.
If systems like this become more influential in search, SEO may increasingly involve helping Google understand the entity behind your content, not just the content itself.
The shift from documents to entities
Google has spent more than 20 years helping users find information published on webpages. Whether through traditional search results, featured snippets, or AI-generated answers, the process has generally started with understanding documents.
As Google’s search products become more conversational and recommendation-driven, understanding individual documents may no longer be enough.
Before an AI system can recommend a business, compare products, explain a brand, or suggest a service provider, it must first understand the entity behind the content.
That’s what makes Google’s “Data extraction using LLMs” patent interesting.
At first glance, the patent may look like another content extraction system. Search engines have been extracting information from webpages for years. However, Google describes a broader objective.
According to the filing:
- “The techniques described throughout this specification enable artificial intelligence to generate and enhance a deep, holistic characterization of a particular entity.”
Google defines an entity broadly, including people, companies, businesses, places, objects, and concepts.
Rather than merely identifying facts or indexing content, the system is designed to interpret information, identify relationships, generate summaries, and develop an understanding of the entity represented by that information.

How Google’s patent creates an understanding of an entity
At a high level, the patent describes a system for gathering information from multiple sources, interpreting that information, and synthesizing an understanding of an entity.

Step 1: Identify the entity
The process begins by identifying a domain and an associated entity. The system then gathers information from webpages associated with that domain and processes it using an artificial intelligence system that includes a large language model (LLM).
Step 2: Interpret the information
Rather than merely extracting facts from individual pages, the system is designed to generate what the patent calls a characterization of the entity.
Google explains that this characterization is “an interpretation of the extracted first content and extracted second content rather than a verbatim duplication of the extracted content.”
In other words, the system goes beyond gathering information. It interprets that information and forms conclusions about the entity behind it.
Step 3: Extract attributes and relationships
The patent further explains that the AI system can analyze webpages to extract information such as an entity’s presence, age, principles, services, reputation, social media sentiment, and relationships between different elements associated with the organization.
These signals help the system move beyond understanding individual webpages toward understanding the entity itself.
Step 4: Supplement with third-party information
Importantly, the patent isn’t limited to information found on a company’s own website. Google notes:
- “The artificial intelligence systems may use online maps data, job listing data, business information, or other suitable third-party data as additional or augmenting input to provide context for generating the characterization that is output by the artificial intelligence system.”
Taken together, the goal appears to be to build a more complete understanding of the entity than could be obtained from any single webpage.
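The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the `EntityInputs` container, the prompt wording, and the `fake_llm` stand-in are all hypothetical, and a real system would call an actual language model where the placeholder sits.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class EntityInputs:
    """Everything gathered about one entity before interpretation (hypothetical shape)."""
    entity: str
    domain: str
    pages: dict[str, str] = field(default_factory=dict)        # url -> page text
    third_party: dict[str, str] = field(default_factory=dict)  # source name -> text


def build_prompt(inputs: EntityInputs) -> str:
    """Combine first-party and third-party text into one prompt (wording is assumed)."""
    parts = [
        f"Characterize the entity '{inputs.entity}' ({inputs.domain}).",
        "Interpret the material below; do not copy it verbatim.",
        "",
        "Website content:",
    ]
    parts += [f"- {url}: {text}" for url, text in inputs.pages.items()]
    parts += ["", "Third-party context:"]
    parts += [f"- {source}: {text}" for source, text in inputs.third_party.items()]
    return "\n".join(parts)


def characterize(inputs: EntityInputs, llm: Callable[[str], str]) -> str:
    """Hand the combined evidence to whatever model callable is supplied."""
    return llm(build_prompt(inputs))


def fake_llm(prompt: str) -> str:
    """Placeholder model so the sketch runs without any external service."""
    evidence = [line for line in prompt.splitlines() if line.startswith("- ")]
    return f"Characterization based on {len(evidence)} pieces of evidence."


inputs = EntityInputs(
    entity="Example Search Co",
    domain="example.com",
    pages={
        "https://example.com/about": "We make information accessible to everyone.",
        "https://example.com/services": "Search, maps, and productivity tools.",
    },
    third_party={
        "job listings": "Hiring engineers focused on privacy and security.",
        "business information": "Listed as a technology company.",
    },
)
summary = characterize(inputs, fake_llm)
print(summary)
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM provider; only the gathering and prompt-assembly steps are shown concretely.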
How the patent represents entities
The system is designed to organize information about an entity into a format that can be interpreted, expanded, and used by other systems.
Entity summaries
After gathering information from webpages and other sources, the patent describes generating an entity summary. The examples provided in the filing aren’t page summaries. Instead, they read more like descriptions of a company’s identity, positioning, values, and characteristics.
One example included in the patent describes a hypothetical company’s brand identity, noting associations with simplicity, accessibility, trust, innovation, and social responsibility.
- “Example Search Co’s brand identity is one of simplicity, clarity, and accessibility. The company’s logo, a colorful, sans-serif E, is instantly recognizable and easy to remember. The color palette is also simple, with a focus on blue and green, which are associated with trust and reliability. Example Search Co’s typography is also clear and easy to read, even at small sizes. The overall tone of Example Search Co’s brand identity is friendly and approachable. The company’s marketing materials often feature simple, humorous illustrations that help to make Example Search Co’s products and services more relatable to users. Example Search Co. also emphasizes its commitment to making information accessible to everyone, regardless of their background or technical expertise.”
Another example presents those same concepts as a set of key attributes rather than a narrative summary.
“Here are some key aspects of Example Search Co’s brand identity:
– Trustworthiness: Example Search Co. is known for its reliable and trustworthy search engine. The company also has a strong commitment to privacy and security.
– Innovation: Example Search Co. is constantly innovating and releasing new products and services. The company is known for its ability to anticipate user needs and deliver innovative solutions.
– Accessibility: Example Search Co’s products and services are designed to be accessible to everyone, regardless of their background or technical expertise.
– Social responsibility: Example Search Co. is committed to using its technology to make a positive impact on the world. The company has a number of initiatives in place to promote sustainability, diversity, and inclusion.”
What’s important here is the overall format. The system takes information distributed across multiple sources, transforms it into an interpretation of the entity, and synthesizes it into a higher-level understanding of the entity.
Entity graphs
The patent describes organizing this understanding in hierarchical graph structures. According to the filing, the generated characterization can include:
- “[A] hierarchical graph structure that includes at least one parent node representing a first attribute of the characterization and at least one leaf node representing a second attribute of the characterization.”
The accompanying figures from the patent provide a better sense of what this means in practice.

The figure above shows an example graph generated for a service-based company.
The figure below provides a similar example for a product-based company. In both cases, the system organizes information into connected relationships rather than isolated facts.

Instead of just knowing that a business offers a service, the system associates that service with audiences, locations, reputation signals, differentiators, and other related attributes.
Instead of only identifying a product, the system could connect it to features, categories, use cases, and related offerings.
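A hierarchical graph with parent and leaf nodes can be illustrated with a small tree in Python. This is only a sketch: the `AttributeNode` class, the `leaf_paths` helper, and the example labels for a made-up plumbing company are assumptions for illustration and are not taken from the patent's figures.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeNode:
    """One attribute of an entity; nodes without children are leaves."""
    label: str
    children: list["AttributeNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def leaf_paths(node: AttributeNode, prefix: tuple = ()) -> list:
    """Return every root-to-leaf path as a tuple of labels."""
    path = prefix + (node.label,)
    if node.is_leaf():
        return [path]
    paths = []
    for child in node.children:
        paths.extend(leaf_paths(child, path))
    return paths


# Hypothetical service-based company; labels are illustrative only.
graph = AttributeNode("Example Plumbing Co", [
    AttributeNode("Services", [
        AttributeNode("Emergency repair"),
        AttributeNode("Water heater installation"),
    ]),
    AttributeNode("Service areas", [AttributeNode("Downtown")]),
    AttributeNode("Reputation", [AttributeNode("Customer reviews")]),
])

paths = leaf_paths(graph)
for p in paths:
    print(" > ".join(p))
```

Each printed path reads as a connected relationship (entity, parent attribute, leaf attribute) rather than an isolated fact, which is the distinction the section draws.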
Entity models
The patent begins to resemble an entity modeling system more than a content extraction system.
- Extracting information answers one question: What information appears on this website?
- Entity modeling answers a different question: What can we understand about this business?
That distinction becomes apparent when you look at the types of information Google says the system can analyze.
The patent specifically references extracting information related to an entity’s presence, age, principles, services, reputation, social media sentiment, and relationships between different elements associated with the business. It also discusses incorporating information from external sources such as maps data, user reviews, business information, and job listings.
Taken together, these aren’t just website attributes. They’re also signals that help define an entity’s identity.
The result is a model that appears capable of answering broader questions about an organization than traditional extraction systems were designed to address.
Rather than only identifying products, services, or facts, the system develops a contextual understanding of who the entity is, what it does, how it’s perceived, and how it relates to other entities.
This is where the patent becomes particularly interesting for SEO.
Understanding information regardless of format
Google has spent years building systems that help machines understand information on the web. Structured data, schema markup, product feeds, business listings, and knowledge graphs all exist, in part, to make information easier to organize, interpret, and connect.
One aspect the patent emphasizes repeatedly is the ability to extract information that wasn’t specifically structured for machine consumption.
The patent explains that the AI system can extract content that has “not been structured for parsing by the artificial intelligence system” and can process information from webpages that haven’t been organized according to the requirements of traditional content extraction systems.
Google identifies this as one of the major advantages of the approach.
According to the filing, existing content extractors are often limited to content that follows predefined structures, while the proposed system can extract and interpret information “regardless of its format.” Rather than reproducing extracted text, the system can generate new content that interprets and synthesizes the information it finds.
The patent suggests Google is exploring ways to use this capability to build a more complete understanding of an entity. That understanding isn’t limited to information found on a company’s own website.
The patent explicitly discusses supplementing website content with information from maps data, business information, job listings, and other third-party sources.
Taken together, the process begins to resemble an entity analysis system rather than a webpage analysis system. The website remains vitally important, but it’s no longer the only source of truth. Instead, the website becomes one of several inputs used to assemble an understanding of the entity behind it.
As AI-powered search experiences become more focused on answering questions, making recommendations, and helping users evaluate options, the quality of those outputs depends on the quality of the system’s understanding.
Before an AI system can recommend a business, summarize a brand, compare products, or explain why one option may be a better fit than another, it first needs a model of the entities involved. The patent describes one possible approach for creating that model.
From webpages to entities: What this means for SEO
Patents don’t tell us exactly how Google will use a technology. Many patents never become products, and even when they do, the implementation often looks different from what’s described in the filing.
What patents can do is reveal how Google is thinking about a problem. In this case, the problem appears to be understanding entities.
That may sound familiar because entity understanding isn’t a new concept within Google Search. Google’s Knowledge Graph, launched more than a decade ago, was built around connecting entities and relationships.
More recently, Google’s emphasis on E-E-A-T, product reviews, business information, and reputation signals has reflected a similar objective: understanding not just what a page says, but who’s behind it and whether that source can be trusted.
LLMs expand Google’s ability to understand entities
What makes this patent worth examining is the role large language models now play in that process.
This patent describes a process in which an AI system can:
- Analyze websites and public information.
- Interpret the information it finds.
- Synthesize an understanding of an entity without requiring that information to be presented in a specific format.
That capability becomes increasingly important as Google’s search experiences move beyond document retrieval.
Consider what’s required for a system like AI Overviews to answer a question about a company, product, or service. The system must first determine what that entity is, what it offers, who it serves, how it differs from alternatives, and whether it’s relevant to the user’s query.
The same challenge exists in AI Mode, Gemini, and recommendation-driven experiences such as Ask Maps. Before an AI system can recommend an entity, it must first understand it.
That idea appears throughout the patent. Google repeatedly describes gathering information from multiple sources, generating summaries, organizing attributes into relationships, and developing an understanding of the entity as a whole.
The patent explains that the system can identify characteristics such as services, reputation, principles, social sentiment, and relationships between different elements associated with the entity.

Webpages become evidence
Through an SEO lens, this suggests a change in how webpages may function.
Traditionally, webpages have been optimized to rank for queries. A service page targets a service keyword. A category page targets a product category. A location page targets a geographic market. These objectives remain important.
However, if systems like the one described in this patent become more influential, webpages may increasingly serve a second purpose. They become evidence used to assemble an understanding of the entity behind them.
- A service page does more than target a keyword. It helps establish what services a business offers.
- A case study does more than attract traffic. It demonstrates experience and expertise.
- A team page helps identify the people behind the organization.
- Customer reviews contribute information about reputation.
- Press coverage, social media, and industry references provide additional signals that reinforce or challenge the system’s developing understanding.
This is one reason the patent’s emphasis on multiple data sources is so interesting. The filing doesn’t describe building an understanding from a single webpage. It describes combining information from websites, maps data, business information, job listings, and other public sources to create a more complete picture of the entity.
Visibility may increasingly depend on entity understanding
The implication here is that visibility may increasingly depend on how effectively Google understands the entity associated with the keywords a page targets. That becomes especially important in environments where users are no longer choosing from a list of 10 blue links.
When an AI system is summarizing options, making recommendations, or narrowing choices on behalf of a user, the quality of its understanding becomes a critical factor in determining which entities are surfaced and how they’re described.
The challenge for SEO may no longer be limited to helping Google understand a page. It may increasingly involve helping Google understand who you are.
How brands can influence entity understanding
If Google’s goal is to synthesize an understanding of a business from its website and other public sources, the practical question becomes: What can organizations do to help shape that understanding?
The patent suggests that entity understanding emerges from the accumulation and interpretation of information across multiple sources rather than any single webpage, profile, or signal.
While the patent doesn’t provide optimization recommendations, it does point to several areas businesses should pay attention to.
Maintain consistency across sources
The patent repeatedly references using information from multiple sources to generate a characterization of an entity.
Because that characterization is “an interpretation of the extracted first content and extracted second content rather than a verbatim duplication of the extracted content,” consistency becomes increasingly important.
Review how your business is described across:
- Your website.
- Business profiles and listings.
- Social media accounts.
- Press coverage.
- Recruiting and job postings.
- Industry directories.
The goal isn’t identical wording everywhere. The goal is to ensure AI systems encounter a consistent understanding of who you are, what you do, and who you serve.
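A rough first pass at this kind of review can be automated. The sketch below compares descriptions from different sources by word overlap (Jaccard similarity) and flags pairs that share very little vocabulary. It is a crude proxy chosen for illustration, not something the patent describes: word overlap measures wording rather than meaning, so flagged pairs are only candidates for a human to read. The source names, sample descriptions, and the 0.2 threshold are all hypothetical.

```python
import re
from itertools import combinations


def tokens(text: str) -> set[str]:
    """Lowercase a description and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of distinct words two descriptions have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def flag_inconsistent(descriptions: dict[str, str], threshold: float = 0.2) -> list:
    """Return source pairs whose overlap falls below the (assumed) threshold."""
    flagged = []
    for (name_a, text_a), (name_b, text_b) in combinations(descriptions.items(), 2):
        score = jaccard(text_a, text_b)
        if score < threshold:
            flagged.append((name_a, name_b, round(score, 2)))
    return flagged


# Hypothetical descriptions of the same business from three sources.
descriptions = {
    "website": "Emergency plumbing and water heater repair in Springfield",
    "business profile": "Plumbing and water heater repair serving Springfield",
    "job posting": "Join our fast growing HVAC installation team",
}

flagged = flag_inconsistent(descriptions)
for name_a, name_b, score in flagged:
    print(f"Review: {name_a} vs {name_b} (overlap {score})")
```

Here the job posting shares no vocabulary with the other two sources, so both pairs involving it are flagged for review, while the website and business profile pass despite using different wording.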
Define the attributes you want associated with your brand
The patent’s example entity summaries focus on characteristics such as trustworthiness, innovation, accessibility, and social responsibility.
Ask yourself:
- What do we want to be known for?
- What differentiates us from competitors?
- What attributes should be associated with our brand?
Examples might include:
- Enterprise software: security, compliance, and scalability.
- Ecommerce: quality, value, and sustainability.
- Local services: expertise, responsiveness, and reputation.
The more clearly these differentiators are communicated, the easier they become for AI systems to identify and associate with the entity.
Support claims with evidence
The patent describes building an understanding of an entity from multiple sources. That means claims alone may carry less weight than evidence that reinforces those claims.
Examples of supporting evidence include:
- Customer reviews.
- Case studies.
- Testimonials.
- Press coverage.
- Industry citations.
- Awards and certifications.
- Author profiles and expertise signals.
The goal isn’t simply publishing more content. The goal is providing evidence that supports the attributes you want associated with your entity.
Strengthen entity relationships
One of the more interesting aspects of the patent is its use of hierarchical graphs to organize relationships between different attributes and concepts.
Businesses should make it easy for search engines and AI systems to understand relationships between:
- Products and services.
- Locations and service areas.
- Audiences and use cases.
- Brands and people.
- Organizations and industries.
The easier these relationships are to identify, the easier it becomes for AI systems to understand where an entity fits and when it should be recommended.
Audit your entity footprint
A useful exercise is to ask:
- If an AI system had to describe our company using information from our website, reviews, profiles, listings, and third-party mentions, what would it say?
The answer may reveal gaps, inconsistencies, or missed opportunities that are difficult to identify when looking at individual pages in isolation.
As AI-powered search becomes increasingly focused on understanding and recommending entities, that broader view of your digital presence may become just as important as traditional page-level optimization.
What this means for enterprise, ecommerce, and local businesses
One of the strengths of this patent is that it isn’t limited to a particular type of entity. Google’s definition is intentionally broad, encompassing businesses, organizations, products, places, concepts, and people.
That breadth suggests the framework could potentially be applied across many different search experiences and industries. The challenges associated with entity understanding are likely to vary depending on the type of business being analyzed.
Enterprise and B2B organizations
Enterprise organizations often face a consistency challenge. Information about the business may be distributed across product pages, investor relations content, press releases, partner websites, recruiting materials, analyst reports, and social media channels. Different departments frequently describe the organization in different ways.
If AI systems are synthesizing an understanding of the entity from multiple sources, consider:
- Is our positioning consistent across channels?
- Would an AI system describe our company the same way regardless of the source it analyzed?
- Are our core differentiators clearly communicated and reinforced?
As AI systems increasingly interpret information across channels, maintaining a coherent entity identity may become just as important as maintaining a consistent brand identity.
Ecommerce and product-focused businesses
The patent’s product-related examples suggest that entity understanding may extend beyond organizations to individual products.
Users often ask questions that require evaluation rather than retrieval. Rather than just searching for a product, they’re asking which product is best for a specific use case, budget, audience, or situation.
For ecommerce brands, consider:
- Are product attributes clearly defined?
- Are category and product relationships easy to understand?
- Do reviews reinforce product strengths and use cases?
- Is supporting content helping explain who a product is for and when it should be recommended?
Product information architecture, reviews, category relationships, and supporting content may all contribute to how products are understood and surfaced in AI-driven experiences.
Local businesses
Local businesses often face a reputational and specialization challenge.
Many of the attributes referenced in the patent align closely with signals already used in local search, including services, reputation, social sentiment, and business information.
For local businesses, consider:
- Is your expertise clearly communicated?
- Do reviews reinforce the services and specialties you want to be known for?
- Are service areas consistently represented across sources?
- Do your website, Google Business Profile, and third-party presence tell the same story?
A local business is more than a set of service pages. It’s an entity associated with specific services, locations, expertise, reviews, and reputation signals gathered from across the web.
The common thread
Across enterprise, ecommerce, and local search, the challenges are similar. Before Google can recommend an entity, compare an entity, or explain an entity, it must first understand that entity. The patent provides one of the clearest examples yet of how that understanding might be constructed.
The next evolution of entity understanding
Patents aren’t product announcements. Google files thousands of patents, and many never become user-facing features.
The most useful way to view this patent isn’t as a roadmap for a future ranking algorithm, but as a window into how Google is approaching the challenge of understanding entities in the age of LLMs.
Throughout the filing, Google repeatedly returns to the same objective: using AI to collect information from websites and public sources, interpret that information, and synthesize an understanding of an entity.
In Google’s own words, the techniques described in the patent enable artificial intelligence to “extract content from a website or domain and other public sources to synthesize an understanding of a particular entity.”
That objective aligns closely with the direction of Google’s newer search experiences. AI Overviews, AI Mode, Ask Maps, and other AI-powered systems all depend on understanding the businesses, products, organizations, and concepts they reference. They evaluate, summarize, compare, and recommend entities.
For SEOs, that may be the most important takeaway. Historically, SEO has focused on helping Google understand webpages.
Patents like this suggest that the next challenge is helping Google understand the entity behind them. That understanding may influence who gets surfaced, who gets cited, and ultimately, who gets selected.

