This article presents a research and development project on artificial intelligence for SEO, which could lead to a practical tool that helps with day-to-day optimization work.
The Machine Learning Model
The basic idea is to “do as Google does”: go back to the root and understand what makes a web page relevant for a search. The task is to “decode” the choices that the Mountain View search engine makes.
The approach, therefore, starts from the SERPs: we patiently catalog all the URLs and keywords of a given semantic area.
Why the semantic area? Because every customer lives on the web in a specific ecosystem made up of topics, competitors, and particular searches. Just as Google treats searches from different areas differently, we also need to analyze each ecosystem on its own terms.
Our analysis allows us to build three machine learning models, which:
- identify which URLs deserve the first page, and for what reason
- identify the search intent behind each keyword
- identify which topics and words are decisive within the content
Alongside these models, a Google rank tracker keeps track of all ongoing changes in the Google SERPs.
To do all this, we analyzed more than 500 factors taken directly from SERPs and web pages.
The result is a clear and objective view of which areas matter most for good positioning.
The model should report categories and individual ranking factors together with their relative importance. This makes it possible to prioritize the actions that can actually make the difference between the first and second page.
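To make the idea of relative importance concrete, here is a minimal sketch. It does not reproduce the project's model: it substitutes a plain Pearson correlation between each factor and a first-page label, and the factor names (`word_count`, `load_time_s`, `title_has_kw`) and all values are hypothetical, invented only for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no ranking signal
    return cov / sqrt(var_x * var_y)

def rank_factors(pages, label="first_page"):
    """Return (factor, score) pairs sorted by absolute correlation with the label."""
    labels = [p[label] for p in pages]
    factors = [k for k in pages[0] if k != label]
    scores = {f: pearson([p[f] for p in pages], labels) for f in factors}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical toy data: one dict per URL, values invented for illustration.
pages = [
    {"word_count": 1800, "load_time_s": 1.2, "title_has_kw": 1, "first_page": 1},
    {"word_count": 1500, "load_time_s": 1.5, "title_has_kw": 1, "first_page": 1},
    {"word_count": 1600, "load_time_s": 2.9, "title_has_kw": 1, "first_page": 1},
    {"word_count": 600, "load_time_s": 1.4, "title_has_kw": 1, "first_page": 0},
    {"word_count": 500, "load_time_s": 3.1, "title_has_kw": 1, "first_page": 0},
    {"word_count": 700, "load_time_s": 2.5, "title_has_kw": 1, "first_page": 0},
]

ranking = rank_factors(pages)
for factor, score in ranking:
    print(f"{factor:15s} {score:+.2f}")
```

In this toy dataset the keyword-in-title factor is present on every page, so it scores zero: a factor that everyone already satisfies cannot explain the difference between the first and second page. A real model over 500 factors would need something more robust than pairwise correlation, which ignores interactions between factors.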
Furthermore, such a model lets us identify the threshold that each individual action needs to reach, which makes the task easier for those who work hands-on on the site.
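One simple way to picture such a threshold is a single-factor cut-off search (a "decision stump"). The sketch below is an assumption about how a threshold could be derived, not the project's method. It also assumes that higher values of the factor are better, and the `word_counts` and `first_page` data are hypothetical.

```python
def best_threshold(values, labels):
    """Find the cut-off on one factor that best separates first-page URLs.

    Tries each midpoint between consecutive distinct values and keeps the
    first one where "value >= threshold" agrees with the label most often.
    Returns (threshold, accuracy).
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = (None, 0.0)
    for t in candidates:
        hits = sum((v >= t) == bool(y) for v, y in zip(values, labels))
        acc = hits / len(values)
        if acc > best[1]:
            best = (t, acc)
    return best

# Hypothetical data: word count per URL and whether it ranks on page one.
word_counts = [1800, 1500, 1600, 600, 500, 700, 900, 1400]
first_page = [1, 1, 1, 0, 0, 0, 0, 1]

threshold, accuracy = best_threshold(word_counts, first_page)
print(threshold, accuracy)
```

A cut-off like this gives the person editing the site a concrete target for one factor, at the cost of treating each factor in isolation.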
For content, we can analyze the text of every single page and, from that, determine which topics and words are important to use for each keyword.
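As a rough illustration of how decisive words might be surfaced, the sketch below compares how often a word appears in top-ranking pages versus the rest for one keyword. The scoring (difference in document frequency) is an assumed stand-in, and the function names and sample texts are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def doc_frequency(docs):
    """Share of documents that contain each word at least once."""
    counts = Counter()
    for doc in docs:
        counts.update(set(tokenize(doc)))
    return {w: c / len(docs) for w, c in counts.items()}

def distinctive_terms(top_docs, other_docs, n=2):
    """Words most over-represented in top-ranking pages versus the rest."""
    top_df = doc_frequency(top_docs)
    other_df = doc_frequency(other_docs)
    lift = {w: df - other_df.get(w, 0.0) for w, df in top_df.items()}
    # Sort by lift (descending), then alphabetically for a stable order.
    return sorted(lift.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Hypothetical page texts for a single keyword.
top_docs = [
    "running shoes cushioning guide for marathon training",
    "best running shoes with cushioning for marathon runners",
    "marathon running shoes review cushioning and grip",
]
other_docs = [
    "cheap running shoes sale",
    "running shoes discount shop",
    "buy shoes online",
]

terms = distinctive_terms(top_docs, other_docs)
print(terms)
```

Words that appear everywhere, such as the keyword itself, get little or no lift, while words that only the top pages share rise to the top of the list.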
One of the main issues: noise filtering
The huge number of events and data points collected from Google’s ecosystem is not something a human can keep track of, nor something the usual analysis tools can handle. The best way to analyze this data and make informed decisions is to use machine learning algorithms, and it all starts with eliminating the “useless” events and data.
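What counts as “useless” is not spelled out here. One simple interpretation, shown below purely as an assumption, is to drop factors whose values barely change across the dataset, since they cannot help distinguish one page from another. The factor names and rows are hypothetical.

```python
from statistics import pvariance

def drop_low_variance(rows, min_variance=1e-9):
    """Remove factors whose values barely change across the dataset.

    Returns the filtered rows and the list of factor names that were kept.
    """
    factors = list(rows[0].keys())
    keep = [f for f in factors if pvariance([r[f] for r in rows]) > min_variance]
    return [{f: r[f] for f in keep} for r in rows], keep

# Hypothetical measurements: every page uses HTTPS, so that factor is noise here.
rows = [
    {"has_https": 1, "word_count": 1800, "h1_count": 1},
    {"has_https": 1, "word_count": 600, "h1_count": 2},
    {"has_https": 1, "word_count": 1500, "h1_count": 1},
]

filtered, kept = drop_low_variance(rows)
print(kept)
```

A variance filter only removes the most obvious noise. Redundant or spurious factors would need further steps that this sketch does not attempt.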
At the moment, the market offers few technologies for gathering data from vast systems with automation capabilities; the closest to what we need is Siscale’s Aiops system.
Reducing noise is just a first step in making life easier for SEO practitioners. Even with a reduced number of events, we still need an intelligent machine learning approach in the form of process automation.
The results
The results we might obtain are interesting only if they validate both the approach and the functioning of the model and its indications.
Organic traffic results are always the most important, but we also need to understand whether the model’s “predictions” are validated at a general level.
In particular, what counts most is understanding whether the pages (across the whole dataset) actually improve their position.
Pages identified as “Optimal”, that is, pages that satisfy the identified positioning factors in the best way possible, should see their situation improve.
At the same time, the less optimized pages will probably lose ground, especially if they were well positioned to begin with.
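A check of this kind could be sketched as follows: compare the share of pages whose position improved among those flagged “Optimal” with the share among the rest. The field names (`optimal`, `pos_before`, `pos_after`) and all values are hypothetical. They are not results from the project.

```python
def improvement_rate(pages, optimal):
    """Share of pages in the given group whose position improved.

    A lower position number is a better ranking, so a page improved
    when pos_after < pos_before.
    """
    group = [p for p in pages if p["optimal"] == optimal]
    if not group:
        return 0.0
    improved = sum(p["pos_after"] < p["pos_before"] for p in group)
    return improved / len(group)

# Hypothetical before/after positions, invented for illustration.
pages = [
    {"url": "/a", "optimal": True, "pos_before": 12, "pos_after": 7},
    {"url": "/b", "optimal": True, "pos_before": 5, "pos_after": 3},
    {"url": "/c", "optimal": True, "pos_before": 9, "pos_after": 9},
    {"url": "/d", "optimal": True, "pos_before": 15, "pos_after": 11},
    {"url": "/e", "optimal": False, "pos_before": 4, "pos_after": 8},
    {"url": "/f", "optimal": False, "pos_before": 6, "pos_after": 10},
    {"url": "/g", "optimal": False, "pos_before": 20, "pos_after": 18},
    {"url": "/h", "optimal": False, "pos_before": 3, "pos_after": 3},
]

optimal_rate = improvement_rate(pages, optimal=True)
other_rate = improvement_rate(pages, optimal=False)
print(optimal_rate, other_rate)
```

If the model's indications hold, the first rate should be clearly higher than the second across the whole dataset, not just for a handful of pages.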
The advantage of such a system will be its ability to continuously adjust the parameters and perform the needed re-optimization with every significant change in the Google ecosystem.